Introduction


The  goal  of  this  work  was  to  study  and  hopefully  compare  in  a 
precise  way  the  various  techniques  for  proving  properties  of  programs 
existing  in  the  literature.  It  soon  turned  out  that  nothing  interesting 
could  be  said  if  one  did  not  state  precisely  what  the  various  methods 
really are within a common logical system. A perfectly adequate system
for doing so was the Logic for Computable Functions (LCF) of Milner [18],
which is based on the work of Scott [29] and [30].

In this framework, proof techniques fall rather nicely into two
classes. For the first class, which includes the methods of Burstall [1],
Floyd [7], Hoare [9] and Manna-Pnueli [16], the semantics needed for
validating the techniques only demands that programs be interpreted as
monotone functions in the sense of Scott [29]. For methods in the second
class, such as those of Scott [30] and Morris [23], programs must be
interpreted as continuous functions.

The methods in the second class are then "more powerful" in that
they can be used for justifying the other techniques. Furthermore,
provided that all methods are expressed within the same logical system,
we can exhibit properties of programs which are provable with the
proof techniques in the second class and not provable with the techniques
in the first class, while the converse does not hold.

Before studying the various proof techniques, we present a minimal
background in Scott's Theory of Computation in Chapter 1. One of the
points of the theory which we thought needed clarification was the
relation between the abstract notion of least fixed-point and the
concrete notion of trace of a program. Chapter 2, which is the most
original part of this thesis, is devoted to this question. We believe
that Theorems 1, 3 and 4 are new while Theorem 2 is a generalization
of a result by Cadiou [2].

In Chapter 3, we study the proof techniques in the first class. The
formal system used is original, although a mere adaptation of Milner's
LCF to a different semantic domain. The reductions of the proof techniques
presented to the rule of fixed-point induction are due to Park [26].

In Chapter 4, we describe reductions of some methods to the rule
of induction of Scott [30]; some of these reductions are also used,
implicitly or explicitly, in de Bakker-Scott [6], Scott [30], Milner [18],
and Milner-Weyrauch [21].


Chapter 1. SCOTT'S THEORY OF COMPUTATION


In  this  chapter,  we  shall  present  an  overview  of  Scott's  theory 
of  computation,  whose  goal  was  to  give  a  "mathematical"  as  opposed  to 
"operational"  semantics  for  high-level  programming  languages.  Only  the 
parts  of  the  theory  which  are  relevant  to  this  dissertation  will  be 
described. In particular, one of Scott's most impressive achievements
was to construct a model for the λ-calculus, which in turn provides a
mathematical semantics for programming peculiarities such as self-modifying
machine code or procedures taking other procedures as arguments. We
shall not concern ourselves with this problem, and the kind of procedure
we are willing to consider has a definite type: a function from
individuals to individuals, or a functional from functions to functions,
etc.  Limited  as  it  is,  the  theory  that  we  shall  describe  is  nevertheless 
powerful  enough  not  only  to  describe  the  semantics  of  non-trivial  subsets 
of  any  programming  language,  but  also  to  justify  all  the  existing  proof 
techniques  for  those  languages.  The  presentation  of  this  chapter,  whose 
only  purpose  is  to  make  the  thesis  more  or  less  self-contained,  is  based 
on  Scott  [29]  except  for  some  minor  technical  details. 

We assume that the reader has some knowledge of elementary lattice
theory and recursion theory.


1.  Data  Types 


As  a  first  step,  let  us  consider  some  examples  of  what  one  would 
like  to  call  data  types: 

(a)  the  boolean  values  true  and  false; 

(b)  the  set  of  integers; 

(c)  the  n-dimensional  arrays  of  integers; 

(d)  the  set  of  subsets  of  integers; 

(e)  the  set  of  computable  partial  functions  over  some  data-type; 

(f)  the  set  of  non-negative  real  numbers. 

Some of those sets contain as elements objects like total functions or
irrational real numbers, which we shall call "infinite elements". They
cannot be described entirely, but one can give better and better finite
approximations to what they really are. For example, the intervals
[3,4], [3.1,3.2], [3.14,3.15], ... form a sequence of approximations
of π.

This suggests that data-types ought to be partially ordered sets.
The notation x ⊑ y means that x approximates y, and ⊑ must
therefore be a reflexive, transitive and antisymmetric relation over
the data-type. For example, if A and B are some subsets of the
integers, A ⊑ B means that A is a subset of B. Similarly, for
any two intervals [x,x'] and [y,y'] of non-negative real numbers,
[x,x'] ⊑ [y,y'] will mean that x ≤ y and y' ≤ x', i.e., [y,y']
gives us a better idea of where the real number lies than [x,x'].
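The approximation ordering on intervals can be sketched in a few lines of Python. The names `Interval` and `approximates` are illustrative assumptions and not notation from the text; exact rationals are used so that the comparisons are not disturbed by rounding.

```python
from fractions import Fraction
from typing import NamedTuple


class Interval(NamedTuple):
    lo: Fraction
    hi: Fraction


def approximates(a: Interval, b: Interval) -> bool:
    """a is below b in the ordering: b lies inside a, so b is at least as informative."""
    return a.lo <= b.lo and b.hi <= a.hi


# The first three approximations of pi quoted in the text.
chain = [
    Interval(Fraction(3), Fraction(4)),
    Interval(Fraction(31, 10), Fraction(32, 10)),
    Interval(Fraction(314, 100), Fraction(315, 100)),
]

# Each interval approximates its successor, so the list is a chain.
is_chain = all(approximates(chain[i], chain[i + 1]) for i in range(len(chain) - 1))
```

The relation is reflexive and transitive by construction, and two intervals that approximate each other have the same end-points, which is the antisymmetry required of the ordering.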

Considering now two integers k and ℓ, we do not wish to say
that one is an approximation of the other. However, it may be the
case that k is not explicitly known, but has to be determined as
the result of some computation. As we all know, this computation may
never terminate, in which case k is said to be undefined; we denote
this by k ≡ UU and clearly UU ⊑ ℓ for any ℓ. We use a different
equality sign "≡" in order to avoid confusion with the regular
equality "=" over the integers. Here, x ≡ y means that x ⊑ y
and y ⊑ x, while x = y is true whenever x and y are the same
integer. For example, 1 ≡ 1 and 1 = 1 are both true, while UU ≡ 1
is false and UU = 1 is undefined. To be precise, one should write
(UU_I = 1) ≡ UU_B, where the subscripts are here to remind us that UU_I
is an undefined integer, while UU_B is an undefined boolean.

To  clarify  those  ideas,  it  is  helpful  to  describe  more  precisely 
the  partial  orderings  over  our  favorite  data  types. 

(a) For the boolean values, the data type looks like

        TT      FF
          \    /
           UU_B

where a line drawn upward from a to b means that b covers a, i.e.,
a ⊑ b with a ≢ b, and a ⊑ c ⊑ b for some c implies either a ≡ c
or c ≡ b.

(b) Although there are infinitely many integers, the corresponding
data type is not much richer:

        1     2    ...    n    ...
          \    \    |    /
                 UU

Data types of this kind, where elements are either completely specified
or undefined, will be called discrete.


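(c) The data type of pairs of booleans already has a richer
structure:

<TT,TT>   <TT,FF>   <FF,TT>   <FF,FF>

[Figure: ordering diagram for pairs of booleans, with the fully defined
pairs above as its top row.]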


(d) In the data type of subsets of some set, A ⊑ B means that A
is a subset of B; the least element UU is the empty set.

(e) As indicated before, the elements of the data type of real
numbers are closed intervals [x,x'] with 0 ≤ x ≤ x' and
[x,x'] ⊑ [y,y'] whenever x ≤ y and y' ≤ x'. It is convenient
to complete the real line with an element ∞, thus allowing [7.1,∞],
for example, to be a real number. The interval [0,∞] reflects a
complete lack of information and should therefore be identified with
the undefined real UU.

(f) If 𝒟 is a data type partially ordered by ⊑, the partial
functions mapping 𝒟 into 𝒟 are ordered by:

    f ⊑ g iff f(x) ⊑ g(x) for all x in 𝒟.

The minimal element UU is the partial function which is everywhere
undefined, i.e., UU(x) ≡ UU for all x in 𝒟.
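As a small illustration of the ordering on partial functions, the following Python sketch represents the undefined element by `None` and compares functions pointwise on a finite sample of arguments. The helper names (`below`, `fun_below`, `double_small`) are assumptions made for this example only, and a finite sample can of course only refute, not establish, the ordering over an infinite data type.

```python
UU = None  # stands for the undefined element of a discrete data type


def below(a, b):
    """a is below b in a discrete data type: UU is below everything, otherwise equality."""
    return a is UU or a == b


def fun_below(f, g, sample):
    """Pointwise ordering: f(x) is below g(x) for every x in the finite sample."""
    return all(below(f(x), g(x)) for x in sample)


def everywhere_undefined(x):
    # The minimal element of the function data type.
    return UU


def double_small(x):
    # A partial function: doubling, defined only on 0..4.
    return 2 * x if x < 5 else UU


def double(x):
    # A total function extending double_small.
    return 2 * x


sample = range(10)
```

On this sample, `everywhere_undefined` lies below `double_small`, which in turn lies below `double`, while the converse comparisons fail.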

Infinite  Elements  as  Limits 

Let us contemplate again the sequence
[3,4], [3.1,3.2], [3.14,3.15], ... . We would like to be able to
define π as the "limit" of these intervals. Abstractly, this will
require that any chain (*)

    x_0 ⊑ x_1 ⊑ ... ⊑ x_i ⊑ x_{i+1} ⊑ ...

has a limit y in the data type 𝒟, which is the least upper bound
of the x_i's, that is, x_j ⊑ y for every j and, for any z in the
data type, x_j ⊑ z for every j implies y ⊑ z. We write y ≡ ⊔_{i≥0} x_i.


According to this notation, in the data-type of real numbers
[1,2] ≡ ⊔_{i>0} [i/(i+1), (2i+1)/i] and, for sets of integers,
{k | k is odd} ≡ ⊔_{i≥0} {1,3,...,2i+1}. Let us define the constant
function one as one(x) ≡ 1 for any integer x, while one(UU) ≡ UU;
this function can also be defined as a limit of partial functions:

    one ≡ ⊔_{i≥0} [λx. if x < i then 1 else UU]


(*) Strictly speaking, we only need denumerable chains to have a limit.
However, when data-types have a denumerable basis (see below),
requiring that countable chains have limits implies that any chain
(and in fact any directed set) also has a limit.

Computability

Asking that the infinite object ⊔_{i≥0} x_i be computable will
require that the x_i themselves be computable. We therefore postulate
the existence of an effectively given subset E of the data type 𝒟,
such that any element of 𝒟 is the limit (not necessarily effective)
of some chain of elements of E. Such a set E will be called a
recursive basis of 𝒟. For example, a data-type in which there are
no infinite ascending chains (booleans, integers, arrays) is its own
basis provided that it is recursive. The finite sets of integers
constitute a basis for the set of subsets of the integers. Similarly,
the set of functions which are undefined for all but a finite number
of arguments is a basis for the data type of partial functions.
Finally, a basis for the real numbers is the set of intervals with
rational end-points.

We can remark that the recursive basis of a data type 𝒟 must be
denumerable. Consequently, all of its elements being obtained as
limits of denumerable chains in the basis, 𝒟 itself has at most a
continuum number of elements. In particular, since there are at most
denumerably many computable objects (i.e., objects defined as limits of
effectively given chains), a non-denumerable data-type will possess
many non-computable elements.

We can summarize the above discussion by the postulate:

    A data-type is a partially ordered set with a
    minimal element, possessing a recursive basis
    and in which every ascending chain has a limit.


Note: This notion of data-type is slightly different from the one
advocated by Scott [29], namely that data-types ought to be complete
lattices. The main technical reason for this choice was the difficulty
which seems to arise in defining our notion of sequential function
in Chapter 2 with complete lattices.


2. Computable Functions over Data Types

The next step is to consider programs as functions mapping data
types into data types, and to derive some mathematical properties of
such functions.

Programs  as  Monotone  Mappings 

Let f be a partial function computed by some program. Whenever
the input x is less defined than the input y, the output f(x) must
be less defined than f(y), i.e., x ⊑ y implies f(x) ⊑ f(y). This
motivates the hypothesis that functions computed by programs are monotonic
mappings over the data type.

Examples 

- The successor function [λx. x+1] over the integers is monotone
if we choose UU+1 ≡ UU.

- The conditional if p then x else y, where

    if UU then x else y ≡ UU
    if TT then x else y ≡ x
    if FF then x else y ≡ y

is monotone with respect to p, x and y. (A function of several
variables is monotone when it is monotone in each of its arguments.)

- As for sets, the functions A ∪ B and A ∩ B are both monotone
in A and B.

- The following definition of division over the reals makes it
a monotone function:

    [x,y] / [x',y'] ≡ [x/y', y/x']

where x/∞ ≡ 0 and x/0 ≡ ∞ for all x in ]0,∞[.
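The first two examples can be checked mechanically on small samples of discrete data types. The sketch below is only an illustration under stated assumptions: `None` plays the role of UU, the sample domains `BOOLS` and `INTS` are arbitrary, and the predicate `defined` is not taken from the text but added as a contrast, since a test for definedness is the typical non-monotone function.

```python
from itertools import product

UU = None  # the undefined element of a discrete data type


def below(a, b):
    """Ordering of a discrete data type: UU is below everything."""
    return a is UU or a == b


def succ(x):
    # Strict extension of the successor: UU + 1 is UU.
    return UU if x is UU else x + 1


def cond(p, x, y):
    # if UU then x else y is UU; otherwise one branch is selected.
    if p is UU:
        return UU
    return x if p else y


def defined(x):
    # Tests whether its argument is defined (hypothetical contrast example).
    return x is not UU


def is_monotone(f, domains):
    """Check that a below b (componentwise) implies f(a) below f(b).

    For functions of several arguments this joint check is equivalent to
    monotonicity in each argument separately.
    """
    points = list(product(*domains))
    return all(
        below(f(*a), f(*b))
        for a in points
        for b in points
        if all(below(u, v) for u, v in zip(a, b))
    )


BOOLS = [UU, True, False]
INTS = [UU, 0, 1, 2]
```

Here `is_monotone(succ, [INTS])` and `is_monotone(cond, [BOOLS, INTS, INTS])` both hold, whereas `defined` fails the check because it maps UU to a defined value that is later contradicted.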


Programs as Continuous Mappings

As it stands now, the theory is already quite adequate for
expressing and proving properties of programs, and Chapter 3 describes
some results which can be derived from the assumption that mappings
between data-types are monotone functions.

However, we are still missing an essential property of computable
functions. Knowing the values of a monotone function over the basis of
a data-type does not determine in general its values over the data-type.
For example, the function

    funny-union(A,B) ≡ A ∪ B   if A or B is finite
                       N       if A and B are infinite

where A and B are two subsets of the natural numbers N, is monotone
but clearly not computable.

Intuitively, the value f(x) of a computable function f at an
infinite object x should be obtained as the limit of the values
f(x_i) over the finite approximations x_i of x. More precisely, let
us consider an arbitrary chain

    e_0 ⊑ e_1 ⊑ ... ⊑ e_n ⊑ e_{n+1} ⊑ ...

of elements in the basis of the data type. Since f is monotone, the
set {f(e_i) | i ≥ 0} is also a chain

    f(e_0) ⊑ f(e_1) ⊑ ... ⊑ f(e_n) ⊑ f(e_{n+1}) ⊑ ...

and the computability of f demands that

    f(⊔_{n≥0} e_n) ≡ ⊔_{n≥0} f(e_n)        (a)

A monotone function satisfying equation (a) for arbitrary chains will
be called continuous. We shall therefore postulate that

    Computable functions are continuous mappings between
    data-types.

Again, a function of several arguments is continuous if it is continuous
in each of its arguments.

Examples 

- The function [λp,x,y. if p then x else y] is continuous.
Addition of two integers, union of two sets, division of reals are also
continuous operations. The functional [λF.[λx. if x = 0 then 1 else x·F(x-1)]]
over the data-type of natural numbers is continuous, both in F and in x.

- Let us define the mappings ∃x p(x) and ∀x p(x), which associate
a boolean to each function p from natural numbers to booleans, as
follows:

    - ∃x p(x) is equal to TT if p(n) ≡ TT for some natural
      number n and equal to UU otherwise.

    - ∀x p(x) is equal to TT if p(n) ≡ TT for all natural
      numbers n ≢ UU and equal to UU otherwise.

We shall verify that [λp.(∃x)p(x)] is continuous while [λp.(∀x)p(x)]
is monotone but not continuous in general. Let
p_0 ⊑ ... ⊑ p_i ⊑ p_{i+1} ⊑ ... be a chain of partial predicates over
the natural numbers. We easily verify that
(⊔_{i≥0} p_i)(x) ≡ ⊔_{i≥0} (p_i(x)). Now, if
(⊔_{i≥0} p_i)(x) ≡ ⊔_{i≥0} p_i(x) ≡ TT for some x, there must exist
an i_0 such that i ≥ i_0 implies p_i(x) ≡ TT; otherwise, either
(⊔_{i≥0} p_i)(x) ≡ FF and again there is an i_0 such that
p_{i_0}(x) ≡ FF, or (⊔_{i≥0} p_i)(x) ≡ UU and p_i(x) ≡ UU
for all i. In all cases we have
(∃x)(⊔_{i≥0} p_i)(x) ≡ ⊔_{i≥0} (∃x)p_i(x) and
∃ is indeed continuous. One shows that ∀ is monotone in a similar way,
and the chain p_i(x) ≡ (x < i), read as TT when x < i and UU
otherwise, provides a counterexample to the continuity of ∀.

Let us now discuss some properties of continuous functions. First
of all, it is possible to define a topology over data-types such that a
function is continuous in the above sense if and only if it is continuous
in the topological sense (see Scott [31]). Without describing the
topology, we can nevertheless say that a subset X of the data-type 𝒟
is directed if for all x, y ∈ X, there exists a z ∈ X such that x ⊑ z
and y ⊑ z. Together with the existence of a denumerable basis for 𝒟,
the fact that continuous functions preserve limits of denumerable chains
implies that continuous functions also preserve least upper bounds of
directed sets. Continuous functions do not however preserve least upper
bounds or greatest lower bounds (when they exist) of arbitrary sets.


3. Fixed Points


Let f be a function over a data-type 𝒟. We say that x ∈ 𝒟 is
a fixed-point of f if x ≡ f(x); we say that y is the least fixed-point
of f if y ≡ f(y) and y ⊑ x for any other fixed-point x.
Note that, whenever it exists, the least fixed-point of f must be
unique; we shall denote it either by μx.f(x) or by x_f.

Theorem (Kleene). Any continuous function over a data-type 𝒟 has
a least fixed-point x_f and

    x_f ≡ ⊔_{n≥0} f^n(UU)

Proof. Here f^n(UU) means f(f(...(f(UU))...)) (n times) and, by
monotonicity of f, the set {f^n(UU)} for n ≥ 0 is indeed a chain. We first
prove that ⊔_{n≥0} f^n(UU) is a fixed point of f. This is easy since

    f(⊔_{n≥0} f^n(UU)) ≡ ⊔_{n≥0} f^{n+1}(UU) ≡ ⊔_{n≥0} f^n(UU)

by continuity of f.

We now prove that ⊔_{n≥0} f^n(UU) must be minimal. Let y be an
arbitrary fixed-point of f, i.e., y ≡ f(y). It is easy to prove by
induction that f^n(UU) ⊑ y for any n. The conclusion ⊔_{n≥0} f^n(UU) ⊑ y
follows immediately.

□


Examples 

- In any data type, UU ≡ [μy.y] and x ≡ [μy.x].

If τ ≡ λf.[λx. if x = 0 then 1 else x·f(x-1)]
and σ ≡ λf.[λx. if x > 100 then x-10 else f(f(x+11))] over the
natural numbers, then

    τ^{n+1}(UU) ≡ [λx. if x ≤ n then x! else UU]

and

    σ^{n+1}(UU) ≡ [λx. if x > 100 then x-10
                       else if x-100 > -n then 91 else UU];

therefore, f_τ ≡ [λx. x!] and f_σ ≡ [λx. if x > 100 then x-10 else 91].
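The Kleene iterates of these two functionals can be evaluated directly. The sketch below is an illustration only: `None` stands for UU, Python closures stand for the semantic functions, and the helper names (`iterate`, `tau_closed_form`, `sigma_closed_form`) are assumptions of this example. The closed form for τ agrees with direct evaluation for every n tried; the closed form for σ is compared only for n ≤ 10, since for larger n direct evaluation appears to be defined on more arguments than the displayed formula indicates (for instance at x = 89 after 12 iterations), which does not affect the limit f_σ.

```python
from math import factorial

UU = None  # the undefined value


def bottom(x):
    """The everywhere-undefined function."""
    return UU


def tau(f):
    """Factorial functional: if x = 0 then 1 else x * f(x - 1)."""
    def g(x):
        if x == 0:
            return 1
        r = f(x - 1)
        return UU if r is UU else x * r  # multiplication is strict
    return g


def sigma(f):
    """Functional of the 91 function: if x > 100 then x - 10 else f(f(x + 11))."""
    def g(x):
        if x > 100:
            return x - 10
        inner = f(x + 11)
        return UU if inner is UU else f(inner)  # f applied to UU is taken to be UU
    return g


def iterate(functional, n):
    """Return functional^n(UU), the n-th Kleene approximation."""
    f = bottom
    for _ in range(n):
        f = functional(f)
    return f


def tau_closed_form(n, x):
    """Value of tau^(n+1)(UU) at x according to the text."""
    return factorial(x) if x <= n else UU


def sigma_closed_form(n, x):
    """Value of sigma^(n+1)(UU) at x according to the text."""
    if x > 100:
        return x - 10
    return 91 if x - 100 > -n else UU
```

For example, `iterate(tau, 6)(5)` evaluates to 120 while `iterate(tau, 6)(6)` is still undefined: each additional iteration extends the domain of definition by one argument, as the chain of approximations in the theorem suggests.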

From these examples, the reader may already suspect that there
must be a relation between recursively defined functions and least
fixed points. The next chapter will be entirely devoted to this
question.


Chapter 2. FIXED-POINTS AND RECURSION


The object of this chapter is to detail the connections between
fixed-points of continuous functionals and recursively defined functions
in a very simple programming language. We first illustrate that the
semantics of recursively defined functions will depend on the
implementation. A careless implementation of recursion will introduce
unnecessary computations, which may even prevent the program from
terminating.

A general criterion for the correctness of an implementation will be
proved. We then describe an implementation of recursion which is both
correct and optimal in a general class of sequential languages and
therefore constitutes an attractive alternative to both "call by value"
and "call by name".


1.  Computations  of  Recursively  Defined  Functions 

Before  defining  a  computation  rule,  we  must  describe  two  programming 
languages,  lang  S  and  lang  P  .  Although  those  two  languages  were 
chosen  for  their  extreme  simplicity,  their  use  of  recursion  is  as  general 
as  any,  and  the  results  of  this  chapter  provide  some  insight  into 
semantics  and  implementation  of  more  complex  programming  languages. 

Lang  S  permits  only  sequential  computations,  and  corresponds 
precisely  to  a  certain  "typed"  subset  of  Algol  or  LISP. 

Lang  P  requires  some  parallel  operations,  and  thus  departs  from 
more  classical  programming  languages,  although  we  could  undoubtedly 
write  an  interpreter  for  lang  P  in  any  of  those  classical  languages. 


1.1 Description of lang S and lang P


Syntax 

Both  languages  have  the  same  syntax: 

<program> ::= F(X_1,...,X_n) <= <term>

<term> ::= X_1 | ... | X_n | A_1 | A_2 | ...
         | G_1(<term 1>,...,<term p_1>)
         | ...
         | G_k(<term 1>,...,<term p_k>)
         | F(<term 1>,...,<term n>)

We  limited  ourselves  to  a  single  recursive  equation,  the  extension 
of  the  results  in  this  chapter  to  systems  of  mutually  recursive 
equations  being  straightforward. 

Here, A_1, A_2, ..., G_1, ..., G_k denote fixed constants and functions
respectively. It is convenient to use a more standard syntax, e.g.,
F(X) <= IF X = 0 THEN 1 ELSE X·F(X-1) instead of
F(X) <= G1(P1(X,A0),A1,G2(X,F(G3(X)))).

The meaning of a program will be a continuous mapping in
[𝒟_1 × ... × 𝒟_n → 𝒟] where each 𝒟_i and 𝒟 are some data-types; for
simplicity, the 𝒟_i's will be identical to 𝒟 unless explicitly
specified.

Semantics of terms in lang P

The meaning of a <term> is a (continuous) functional
λf.λx_1,...,x_n.𝒥(<term>) where the semantic function 𝒥 is defined
inductively as follows:

(i) 𝒥(A_i) ≡ a_i where a_i ∈ 𝒟

(ii) 𝒥(X_i) ≡ x_i

(iii) 𝒥(G_k(<term 1>,...,<term p_k>)) ≡ g_k(𝒥(<term 1>),...,𝒥(<term p_k>))
where g_k is some continuous function in [𝒟^{p_k} → 𝒟].

(iv) 𝒥(F(<term 1>,...,<term n>)) ≡ f(𝒥(<term 1>),...,𝒥(<term n>)).

Here we have to prove that this is continuous, i.e., that continuous
functions are closed under composition, λ-abstraction and fixed-point
operation. The reader can find these proofs either in Scott [30] or in
Milner [19].

Semantics  of  Terms  in  lang  S 

The semantics of lang S is defined in precisely the same way as
that of lang P, the difference lying in restrictions on the interpretation
of base functions. In lang S, we require functions to be sequential,
i.e., roughly that their arguments can be computed in sequence. We shall
give later a precise definition of this notion. For expository purposes,
however, we shall limit ourselves for the moment to studying a particular
sequential language.

The data-types on which our particular lang S is computing are
discrete, i.e., they look like:

[Figure: diagram of a discrete data-type]

In what follows, we use ω instead of UU_𝒟 and Ω in place of UU_[𝒟→𝒟]
in order to help the eye avoid type confusions. Among the base functions,
we point out a particular one, denoted IF-THEN-ELSE, whose interpretation
is the usual conditional, i.e., if uu then x else y ≡ ω,
if tt then x else y ≡ x and if ff then x else y ≡ y.
All other base functions are required to be strict, i.e.,
g_k(...,ω,...) ≡ ω: they are undefined as soon as at least one of their
arguments becomes undefined. They are meant to correspond to the
"hardware" functions: add, addone, test-for-equality, ... .

It will be shown that all functions definable in lang S are
sequential. The symmetric OR defined by the table:

     OR  |  uu   tt   ff
    -----+----------------
     uu  |  uu   tt   uu
     tt  |  tt   tt   tt
     ff  |  uu   tt   ff

or the symmetric multiply * where 0*x ≡ x*0 ≡ 0 are not sequential,
and are therefore not definable in lang S, nor in Algol for that matter.
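The table can be transcribed and compared with an OR built from the conditional. The sketch below rests on a rough reading of "arguments can be computed in sequence", namely that a sequential function of two arguments must be strict in whichever argument it examines first; this is an informal assumption, since the precise definition is given later in the text. The names `parallel_or`, `sequential_or` and `strict_in` are illustrative.

```python
from itertools import product

UU = None  # written uu in the table
VALUES = [UU, True, False]


def below(a, b):
    """Ordering of the discrete boolean data type."""
    return a is UU or a == b


def parallel_or(x, y):
    """Symmetric OR of the table: tt as soon as either argument is tt."""
    if x is True or y is True:
        return True
    if x is False and y is False:
        return False
    return UU


def sequential_or(x, y):
    """IF x THEN tt ELSE y: examines x first, so it is undefined when x is."""
    if x is UU:
        return UU
    return True if x else y


def is_monotone(f):
    """Check monotonicity of a two-argument function over VALUES."""
    pts = list(product(VALUES, repeat=2))
    return all(
        below(f(*a), f(*b))
        for a in pts
        for b in pts
        if below(a[0], b[0]) and below(a[1], b[1])
    )


def strict_in(f, position):
    """True when f returns UU whenever its argument at `position` is UU."""
    for other in VALUES:
        args = (UU, other) if position == 0 else (other, UU)
        if f(*args) is not UU:
            return False
    return True
```

The symmetric OR is monotone but strict in neither argument, so no evaluation order of its two arguments can compute it, while the conditional-based OR is strict in its first argument.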

Semantics  of  Programs  in  both  lang  S  and  lang  P 

The functional τ ≡ λf.λx_1,...,x_n.𝒥(<term>) as defined in lang S
or lang P can be shown to be continuous. It must therefore have a
least fixed-point f_τ and it would be nice to define the meaning ℳ of
the corresponding program as ℳ(<program>) ≡ f_τ.

This is unfortunately not true for all implementations of recursion,
and our goal will be to characterize the implementations for which the
computed function is equal to this least fixed-point.

1.2  Conventions  and  Notations 

The  reader  has  already  noticed  that  syntactic  entities  are  denoted 
by  upper  case  letters,  while  the  associated  semantic  objects  are 
represented  by  the  corresponding  lower-case  letters.  We  shall  keep  this 
convention throughout this chapter. For example, if T is the term
IF X = 0 THEN 1 ELSE X·F(X-1), then its meaning t is
λf.λx. if x = 0 then 1 else x·f(x-1), where = in this last expression
means the equality function over the natural numbers, 0 the number 0,
etc.

From now on, we use upper case letters other than A, D, X, F
and G to denote (syntactic) terms. If T and S are terms, we denote
by T{S/X_i} the result of replacing all occurrences of the letter X_i
by the term S in T. By T{P/F}, we mean the term obtained by
replacing in T all subterms of the form F(T_1,...,T_n) by
P{T_1/X_1,...,T_n/X_n}. For example,

    if T = G1(F(X1,F(X1,X2)),X1) and P = G(F(X2,X1))
    then T{P/F} = G1(G(F(G(F(X2,X1)),X1)),X1).

Whenever we only wish to substitute P for some occurrences of F
in T, we rename, say F_1, the occurrences that we shall substitute
and F_2 the others. The result of the substitutions is then
T{P/F_1, F/F_2}. The same kind of notation also applies to semantic terms.
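The two substitution operations can be sketched on a simple term representation. The encoding is an assumption of this example, not part of the text: an application is a tuple whose first component is the function letter, and variables or constants are plain strings. The function names `subst_vars` and `subst_body` are likewise illustrative.

```python
def subst_vars(term, env):
    """T{S1/X1,...,Sn/Xn}: simultaneously replace variables according to env."""
    if isinstance(term, str):
        return env.get(term, term)
    head, *args = term
    return (head, *(subst_vars(a, env) for a in args))


def subst_body(term, body, params, fname="F"):
    """T{P/F}: replace every subterm F(T1,...,Tn) by P{T1/X1,...,Tn/Xn}.

    Occurrences of F inside the arguments are replaced as well; the
    occurrences of F introduced by the body P are left untouched.
    """
    if isinstance(term, str):
        return term
    head, *args = term
    new_args = [subst_body(a, body, params, fname) for a in args]
    if head == fname:
        return subst_vars(body, dict(zip(params, new_args)))
    return (head, *new_args)


# The example of the text: T = G1(F(X1,F(X1,X2)),X1) and P = G(F(X2,X1)).
T = ("G1", ("F", "X1", ("F", "X1", "X2")), "X1")
P = ("G", ("F", "X2", "X1"))
result = subst_body(T, P, ["X1", "X2"])
```

With this encoding, `result` is the tuple form of G1(G(F(G(F(X2,X1)),X1)),X1), in agreement with the worked example.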

We use F(X) and f(x) as abbreviations for F(X_1,...,X_n) and
f(x_1,...,x_n) respectively.

Also, it will be convenient to consider only programs F(X) <= P
where P is of the form G(P_1,...,P_k) with the additional restriction
that each of the letters F, X_1,...,X_n occurs at least once in P.
That is, P is required not to ignore any of its program variables,
to depend upon F (i.e., to be recursive) and not to be of the
uninteresting form F(X) <= F(T_1,...,T_n). The main results of this
chapter generalize without this restriction, but the proofs are made
longer by the addition of special cases.


1.3 Computation Rule

A computation rule 𝒞 is an algorithm for selecting some occurrences
of the letter F in each term. For any such rule and input D, we
construct the computation sequence T_0, T_1, ..., T_n, ... of the term T
by the program F(X) <= P as follows: T_0 = T{D/X} and T_{i+1} is the
result of substituting P for the F's chosen by 𝒞 in T_i. For
example, if P = IF X < 2 THEN X ELSE F(X-1) + F(X-2), the computation
sequence of F(X) according to "call-by-value" for input X = 2 is:

    T_0 = F(2)

    T_1 = IF 2 < 2 THEN 2 ELSE F(1) + F(0)

    T_2 = IF 2 < 2 THEN 2 ELSE (IF 1 < 2 THEN 1 ELSE F(0) + F(-1)) + F(0)

    T_3 = IF 2 < 2 THEN 2
          ELSE (IF 1 < 2 THEN 1 ELSE F(0) + F(-1)) +
               IF 0 < 2 THEN 0 ELSE F(-1) + F(-2)

(Here, F(1) is in fact an abbreviation for F(2-1), etc.)

In T_i, we underline the F's selected by the computation rule
for substitution. It is interesting to see precisely how the underlined
F is selected in this last example. For this purpose, we must introduce
the notion of simplification. The simplification mechanism is discussed
at length in Cadiou [2], and we refer the interested reader to this
work. In our particular example, it is possible to define a
simplification mechanism λT.simpl(T) such that

    simpl(T_0) = F(2)
    simpl(T_1) = F(1) + F(0)
    simpl(T_2) = 1 + F(0)
    simpl(T_3) = simpl(T_4) = ... = 1

(Note that now, F(1) is no longer an abbreviation since simpl(2-1) = 1.)

The rule "call-by-value" then selects the leftmost-innermost
occurrence of F in simplified terms. Similarly, "call-by-name"
selects the leftmost-outermost one.

In its most general form, simplification can be an extremely
powerful computation tool. For example, if our program is
F(X) <= IF X = 0 THEN 0 ELSE F(X-1), it is perfectly all right to use
F(X) → 0 as a simplification rule over the natural numbers, and there
is no room left for substitutions! Our purpose however is to study
computations which are performed by substitutions and not by
simplifications.

We must therefore restrict the power of simplifications which we
allow and, for this purpose, we merely borrow Cadiou's notion of
standard simplifications (see Cadiou [2] for a precise definition).
Roughly, standard simplifications force us to know everything about
base functions, and nothing a priori about the recursively defined
function F, since simplifications which rewrite a term F(D) directly
to a value are not permitted. In effect, we have to compute without
any "built in" value of the recursively defined function, stored for
example in memory from a previous computation.

We  will  not  study  standard  simplifications  in  lang  F  ,  since  this 
would  require  describing  completely  the  data-type  on  which  computations 
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are  performed  but  we  will  describe  them  in  lang  S  . 

For  all  constants  A  ,  . . . ,  A .  and  base  function  G  there 

1J-  J  P  ir 

exists  a  standard  simplification  of  the  type 


G  (A.,,  •  •  • ,  A.  )  - 

ps  ll  ip' 


A. 

J 


In effect, this says that the values of the base functions over the domain
are known, and these functions are total. Accordingly, the conditional
admits the simplifications

    IF TRUE THEN B ELSE C → B and
    IF FALSE THEN B ELSE C → C.

These are the only standard simplifications in lang S and we say
that a term is simplified when all of its subterms have been simplified.
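
The standard simplifications of lang S can be illustrated by a minimal Python sketch. The term encoding (nested tuples with the function symbol first, Python integers and booleans as constants, strings as variables) and the sample table `BASE` of base functions are assumptions made for this illustration only; they are not part of the original text.

```python
import operator

# Hypothetical sample of total base functions whose values are fully known.
BASE = {"+": operator.add, "-": operator.sub, ">": operator.gt, "=": operator.eq}


def is_constant(term):
    # Constants are Python ints/bools; variables are strings; applications are tuples.
    return isinstance(term, (int, bool))


def simplify(term):
    """Apply the standard simplifications bottom-up.

    Base functions applied to constants are replaced by their value, and a
    conditional whose test is TRUE or FALSE is replaced by the chosen branch.
    A term F(...) is never replaced by a value: nothing is known a priori
    about the recursively defined function.
    """
    if not isinstance(term, tuple):
        return term
    head = term[0]
    if head == "IF":
        test = simplify(term[1])
        if test is True:
            return simplify(term[2])
        if test is False:
            return simplify(term[3])
        return ("IF", test, simplify(term[2]), simplify(term[3]))
    args = tuple(simplify(arg) for arg in term[1:])
    if head in BASE and all(is_constant(arg) for arg in args):
        return BASE[head](*args)
    return (head,) + args
```

For instance, `simplify(("IF", (">", 1, 0), ("-", ("F", 2), 1), ("F", ("F", 3))))` selects the THEN branch and leaves the occurrence of F untouched.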


1.4  Computation  Lattice  of  a  Program 

Instead of considering computation sequences for each input and
computation rule, we can apprehend the set of all possible computations
in one infinite diagram.

For example, the computation diagram of the term F(F(X)) by the
program F(X) <= G(X,F(F(X))) looks like

[Figure: computation diagram of F(F(X)) by F(X) <= G(X,F(F(X)))]

A computation rule is then an algorithm for selecting a path in such
a graph for each input. This computation diagram has a very rich
structure which we shall now study.

Computation of a term according to P

We say that $B \to_P C$, or simply $B \to C$, whenever C can be obtained
by substituting P for some occurrences of F in B.

The notation $B \to_P^{*} C$, or $B \to^{*} C$, means that there exists a
finite sequence of terms $D_0, \ldots, D_m$ such that $D_0 = B$, $D_m = C$
and $D_i \to_P D_{i+1}$ for $0 \le i < m$.

Definition

The computation diagram of T by P is the set of terms U such
that $T \to_P^{*} U$, partially ordered by $\le$, where $B \le C$ whenever
$B \to_P^{*} C$.

It is clear that $\le$ is reflexive and transitive. In order to prove
that it is also antisymmetric, we notice that, if $B \to^{*} C$, the size
$\|C\|$ (where size is, say, the number of symbols) of the term C is
strictly larger than the size of B if at least one substitution has
been performed (this is due to our restriction on P). It follows
that $B \to^{*} C$ and $C \to^{*} B$ implies $B = C$.

Clearly, the computation diagram of T by P has the Church-Rosser
property of the λ-calculus. (This follows from the work of Rosen [28]
for example.) However, it also has a property which is not true of the
λ-calculus, namely:




Theorem  1 


The computation diagram of T by P is a lattice under the
ordering $\le$, and we shall name it the computation lattice of T by P.

Proof. In order to study the structure of the computation diagram of
a term by a program P, we need to relate the structure of C to
that of B when $B \to_P^{*} C$.

Lemma 1

(i) $A_i \to^{*} C$ if and only if $C = A_i$, and $X_j \to^{*} C$ if and only if $C = X_j$.

(ii) $G_i(B_1, \ldots, B_{p_i}) \to^{*} C$ if and only if $C = G_i(C_1, \ldots, C_{p_i})$ and
$B_j \to^{*} C_j$ for $1 \le j \le p_i$.

(iii) $F(B_1, \ldots, B_n) \to^{*} C$ if and only if $C = F(C_1, \ldots, C_n)$ with $B_i \to^{*} C_i$
for $1 \le i \le n$, or $P\{B_1/X_1, \ldots, B_n/X_n\} \to^{*} C$.


Proof. Claims (i) and (ii) are easy and we only prove (iii).

If $B = F(B_1, \ldots, B_n) \to^{*} C$ and C is not of the form $F(C_1, \ldots, C_n)$,
there must be a point in the computation $B \to^{*} C$ where the outermost F
of B is substituted, i.e.,

    $F(B_1, \ldots, B_n) \to^{*} F(B'_1, \ldots, B'_n) \to P\{B''_1/X_1, \ldots, B''_n/X_n\} \to^{*} C$

with $B'_i \to^{*} B''_i$ (and therefore $B_i \to^{*} B''_i$) for any $1 \le i \le n$.

It follows from our definitions that $B_i \to^{*} B''_i$ for $1 \le i \le n$
implies $P\{B_1/X_1, \ldots, B_n/X_n\} \to^{*} P\{B''_1/X_1, \ldots, B''_n/X_n\}$ and consequently
$P\{B_1/X_1, \ldots, B_n/X_n\} \to^{*} C$, as claimed in (iii). In order to get the
other part of the implication (iii), we simply notice that
$F(B_1, \ldots, B_n) \to P\{B_1/X_1, \ldots, B_n/X_n\}$ by substituting P for the outer
F in $F(B_1, \ldots, B_n)$.

□


If $B \le C$, we can define a distance dist(B,C) between B and C
as follows:

(i) if $B = A_i$ or $B = X_j$ then $C = B$ and dist(B,C) = 0;

(ii) if $B = G_i(B_1, \ldots, B_{p_i})$ then $C = G_i(C_1, \ldots, C_{p_i})$ with $B_j \le C_j$
for $1 \le j \le p_i$ and $\mathrm{dist}(B,C) = \max_{1 \le j \le p_i} \{\mathrm{dist}(B_j,C_j)\}$;

(iii) if $B = F(B_1, \ldots, B_n)$ then (by Lemma 1), either $C = F(C_1, \ldots, C_n)$
and $\mathrm{dist}(B,C) = \max_{1 \le i \le n} \{\mathrm{dist}(B_i,C_i)\}$, or
$P\{B_1/X_1, \ldots, B_n/X_n\} \le C$ and
$\mathrm{dist}(B,C) = 1 + \mathrm{dist}(P\{B_1/X_1, \ldots, B_n/X_n\}, C)$.

It is easily seen that the distance between any two terms $B \le C$ is
finite.
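
The three clauses of the definition of dist translate directly into a recursive procedure. The following Python sketch is only an illustration under stated assumptions: terms are nested tuples with the function symbol first, variables and constants are strings, the recursively defined function is always named "F", and the outermost symbol of the body P is a base function (so that clause (iii) is unambiguous). The names `subst`, `unfold` and `dist` are hypothetical helpers, not notation from the text.

```python
def subst(term, env):
    """Simultaneously replace the variables of term according to env."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(subst(arg, env) for arg in term[1:])


def unfold(term, params, body):
    """Return P{B1/X1, ..., Bn/Xn} for term = F(B1, ..., Bn)."""
    return subst(body, dict(zip(params, term[1:])))


def dist(b, c, params, body):
    """Distance between b and c, assuming b <= c in the computation diagram."""
    if b == c:
        return 0
    if isinstance(b, str) or isinstance(c, str):
        raise ValueError("terms are not comparable")
    if b[0] == c[0] and len(b) == len(c):
        # Clauses (ii) and the first half of (iii): same outer symbol.
        return max((dist(x, y, params, body) for x, y in zip(b[1:], c[1:])), default=0)
    if b[0] == "F":
        # Second half of (iii): the outermost F has been substituted.
        return 1 + dist(unfold(b, params, body), c, params, body)
    raise ValueError("terms are not comparable")


# Program F(X) <= G(X, F(F(X))) used as an example in the text.
PARAMS = ("X",)
BODY = ("G", "X", ("F", ("F", "X")))
B = ("F", ("F", "X"))
C = ("G", BODY, ("F", ("F", BODY)))
example_distance = dist(B, C, PARAMS, BODY)
```

Here C is obtained from F(F(X)) by substituting first the inner F and then the outer one, and the sketch returns 2 for `example_distance`.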


Lemma 2

If $B = F(B_1, \ldots, B_n)$, $C = F(C_1, \ldots, C_n)$, $B' = P\{B_1/X_1, \ldots, B_n/X_n\}$
and $C' = P\{C_1/X_1, \ldots, C_n/X_n\}$ then $B \le C$ implies $B' \le C'$ and
$\mathrm{dist}(B',C') \le \mathrm{dist}(B,C)$.

Proof. By a straightforward induction on $\|P\|$, one proves that

    $\mathrm{dist}(P\{B_1/X_1, \ldots, B_n/X_n\}, P\{C_1/X_1, \ldots, C_n/X_n\}) \le \max_{1 \le i \le n} \{\mathrm{dist}(B_i,C_i)\}$

hence $\mathrm{dist}(B',C') \le \mathrm{dist}(B,C)$.

□




We now start the proof of Theorem 1:

For any two terms B, C in the computation diagram of T by P,
we must show the existence of min(B,C) and max(B,C) such that

    $\min(B,C) \le B \le \max(B,C)$ and $\min(B,C) \le C \le \max(B,C)$

and, for any Q and H,

    $Q \le B$, $Q \le C$, $B \le H$ and $C \le H$

implies

    $Q \le \min(B,C)$ and $\max(B,C) \le H$.


Existence  of  max(B,C) 

We shall describe an algorithm for computing max(B,C) and then
prove the correctness of this algorithm: let $\sigma(B,C)$ be defined
recursively as

(i) $\sigma(B,B) = B$,

(ii) $\sigma(G_i(B_1, \ldots, B_{p_i}), G_i(C_1, \ldots, C_{p_i})) = G_i(\sigma(B_1,C_1), \ldots, \sigma(B_{p_i},C_{p_i}))$,

(iii) $\sigma(F(B_1, \ldots, B_n), F(C_1, \ldots, C_n)) = F(\sigma(B_1,C_1), \ldots, \sigma(B_n,C_n))$,

(iv) $\sigma(F(B_1, \ldots, B_n), G(C_1, \ldots, C_p)) = \sigma(P\{B_1/X_1, \ldots, B_n/X_n\}, G(C_1, \ldots, C_p))
= \sigma(G(C_1, \ldots, C_p), F(B_1, \ldots, B_n))$,

(v) in all the other cases, $\sigma(B,C)$ yields an error symbol (say a
German gothic letter) which is not part of our set of letters.
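
The recursive definition of σ can be sketched as a short Python function. This is an illustration only: the tuple encoding of terms, the fixed name "F" for the recursively defined function, the use of an exception in place of the error symbol of clause (v), and the helper names `subst`, `unfold` and `sigma` are all assumptions of the sketch. It also assumes that the outermost symbol of the body P is a base function.

```python
def subst(term, env):
    """Simultaneously replace the variables of term according to env."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(subst(arg, env) for arg in term[1:])


def unfold(term, params, body):
    """Return P{B1/X1, ..., Bn/Xn} for term = F(B1, ..., Bn)."""
    return subst(body, dict(zip(params, term[1:])))


def sigma(b, c, params, body):
    """Candidate least upper bound of b and c, following clauses (i) to (v)."""
    if b == c:                                    # clause (i)
        return b
    if isinstance(b, str) or isinstance(c, str):  # clause (v)
        raise ValueError("no common upper bound")
    if b[0] == c[0] and len(b) == len(c):         # clauses (ii) and (iii)
        return (b[0],) + tuple(sigma(x, y, params, body) for x, y in zip(b[1:], c[1:]))
    if b[0] == "F":                               # clause (iv)
        return sigma(unfold(b, params, body), c, params, body)
    if c[0] == "F":                               # clause (iv), symmetric form
        return sigma(b, unfold(c, params, body), params, body)
    raise ValueError("no common upper bound")     # clause (v)


# Program F(X) <= G(X, F(F(X))) and two terms of the diagram of F(F(X)).
PARAMS = ("X",)
BODY = ("G", "X", ("F", ("F", "X")))
INNER = ("F", BODY)                                   # inner F substituted
OUTER = ("G", ("F", "X"), ("F", ("F", ("F", "X"))))   # outer F substituted
upper = sigma(INNER, OUTER, PARAMS, BODY)
```

On this example the result is G(B1, F(F(B1))) with B1 = G(X,F(F(X))), the term reached from either side by performing the substitution the other side has already made.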
We shall prove that $\sigma(B,C) = \max(B,C)$ in two parts:


Part 1. For any terms T, B, C,

    $T \to^{*} B$ and $T \to^{*} C$

implies

    $B \to^{*} \sigma(B,C)$ and $C \to^{*} \sigma(B,C)$.

The proof is by induction on couples $\langle \mathrm{dist}(T,B) + \mathrm{dist}(T,C), \|T\| \rangle$, ordered
lexicographically by <. Assuming the result to be true for all
triples T', B', C' with
$\langle \mathrm{dist}(T',B') + \mathrm{dist}(T',C'), \|T'\| \rangle < \langle \mathrm{dist}(T,B) + \mathrm{dist}(T,C), \|T\| \rangle$,
we prove it for T, B, C by a case analysis on the structure of T.


Case 1. $T = A_i$ or $T = X_j$.

By Lemma 1, $T \to^{*} B$ and $T \to^{*} C$ implies T = B and T = C; hence
$B = C = \sigma(B,C)$ and indeed $B \to^{*} \sigma(B,C)$ and $C \to^{*} \sigma(B,C)$.

Case 2. $T = G_i(T_1, \ldots, T_{p_i})$.

By Lemma 1, $B = G_i(B_1, \ldots, B_{p_i})$ and $C = G_i(C_1, \ldots, C_{p_i})$, with
$T_j \to^{*} B_j$ and $T_j \to^{*} C_j$ for $1 \le j \le p_i$. Since
$\mathrm{dist}(T_j,B_j) + \mathrm{dist}(T_j,C_j) \le \mathrm{dist}(T,B) + \mathrm{dist}(T,C)$ and $\|T_j\| < \|T\|$
for any $1 \le j \le p_i$, the induction hypothesis tells us that
$B_j \to^{*} \sigma(B_j,C_j)$ and $C_j \to^{*} \sigma(B_j,C_j)$ for each $1 \le j \le p_i$.
Regrouping everything, the conclusion $B \to^{*} \sigma(B,C)$ and
$C \to^{*} \sigma(B,C)$ then follows from the definition
$\sigma(G_i(B_1, \ldots, B_{p_i}), G_i(C_1, \ldots, C_{p_i})) = G_i(\sigma(B_1,C_1), \ldots, \sigma(B_{p_i},C_{p_i}))$.


Case 3. $T = F(T_1, \ldots, T_n)$.

By symmetry, we only need consider the subcases:

Case 3.1. $B = F(B_1, \ldots, B_n)$ and $C = F(C_1, \ldots, C_n)$.

The proof is similar to that of Case 2.

Case 3.2. $B = F(B_1, \ldots, B_n)$ and $C = G(C_1, \ldots, C_p)$.

Let $T' = P\{T_1/X_1, \ldots, T_n/X_n\}$ and $B' = P\{B_1/X_1, \ldots, B_n/X_n\}$.
By Lemma 1, we know that $T' \to^{*} C$ and $T_i \to^{*} B_i$ for $1 \le i \le n$, hence
$T' \to^{*} B'$. By Lemma 2, we know that $\mathrm{dist}(T',B') \le \mathrm{dist}(T,B)$. Since
$\mathrm{dist}(T',C) < \mathrm{dist}(T,C)$, we can apply the induction hypothesis to the
terms T', B', C, i.e., $B' \to^{*} \sigma(B',C)$ and $C \to^{*} \sigma(B',C)$. Since
$B \to B'$ and $\sigma(B,C) = \sigma(B',C)$ by definition of $\sigma$, we have established
that $B \to^{*} \sigma(B,C)$ and $C \to^{*} \sigma(B,C)$.

Case 3.3. $B = G(B_1, \ldots, B_p)$ and $C = G(C_1, \ldots, C_p)$.

Let $T' = P\{T_1/X_1, \ldots, T_n/X_n\}$. By Lemma 1, we know that $T' \to^{*} B$
and $T' \to^{*} C$. Since $\mathrm{dist}(T',B) < \mathrm{dist}(T,B)$ and $\mathrm{dist}(T',C) < \mathrm{dist}(T,C)$
we can use the induction hypothesis in order to get $B \to^{*} \sigma(B,C)$ and
$C \to^{*} \sigma(B,C)$.


Part 2. For any terms B, C, Q,

    $B \to^{*} Q$ and $C \to^{*} Q$ implies $\sigma(B,C) \le Q$.

The proof is by induction on $\langle \mathrm{dist}(B,Q) + \mathrm{dist}(C,Q), \|Q\| \rangle$.

Case 1. $Q = A_i$ or $Q = X_j$.

Then $Q = B = C = \sigma(B,C)$ and $\sigma(B,C) \to^{*} Q$.

Case 2. $Q = F(Q_1, \ldots, Q_n)$ or $Q = G_i(Q_1, \ldots, Q_{p_i})$ where $G_i$ is not G.

The proof goes mutatis mutandis as that of Part 1, Case 2.

Case 3. $Q = G(Q_1, \ldots, Q_p)$.

We only need consider the cases:

Case 3.1. $B = G(B_1, \ldots, B_p)$ and $C = G(C_1, \ldots, C_p)$.

Back to Case 2.

Case 3.2. $B = F(B_1, \ldots, B_n)$ and $C = G(C_1, \ldots, C_p)$.

Let $B' = P\{B_1/X_1, \ldots, B_n/X_n\}$. Since $\mathrm{dist}(B',Q) < \mathrm{dist}(B,Q)$,
we know by the induction hypothesis that $\sigma(B',C) = \sigma(B,C) \to^{*} Q$.

Case 3.3. $B = F(B_1, \ldots, B_n)$ and $C = F(C_1, \ldots, C_n)$.

Let $B' = P\{B_1/X_1, \ldots, B_n/X_n\}$ and $C' = P\{C_1/X_1, \ldots, C_n/X_n\}$.
The induction hypothesis tells us that $\sigma(B',C') \to^{*} Q$. One then proves
by induction on $\|P\|$ that

    $\sigma(P\{B_1/X_1, \ldots, B_n/X_n\}, P\{C_1/X_1, \ldots, C_n/X_n\}) = P\{\sigma(B_1,C_1)/X_1, \ldots, \sigma(B_n,C_n)/X_n\}$.

We conclude the proof by noticing that $\sigma(B,C) \to^{*} \sigma(B',C')$ since
$\sigma(B,C) = F(\sigma(B_1,C_1), \ldots, \sigma(B_n,C_n)) \to P\{\sigma(B_1,C_1)/X_1, \ldots, \sigma(B_n,C_n)/X_n\} = \sigma(B',C')$.


Existence  of  min(B,C) 

For any terms B, C in the computation diagram of T by P the
set $\{L \mid L \le B, L \le C\}$ of lower bounds of B and C is not empty,
because $T \le B$ and $T \le C$, and it is finite. We know from elementary
lattice theory that, if any two elements in a partially ordered set have
a least-upper-bound, any non-empty finite subset also has a
least-upper-bound. We then define min(B,C) as $\max\{L \mid L \le B, L \le C\}$ and verify
easily that min has all the desired properties.

□ 


Relation  Between  the  Computation  Lattice  and  the  Data-type  of  Continuous 
Functions  over  £ 

In  order  to  characterize  computed  partial  functions  in  terms  of  the 
semantic  interpretation  of  a  given  computation  lattice,  we  notice  that 

Lemma C

For any terms B, C in the computation lattice of T by P,
$B \le C$ implies $b(\Omega) \sqsubseteq c(\Omega)$.

Proof. The proof is straightforward by induction on $\|B\|$:

If $B = A_i$ or $B = X_j$ then B = C and $b(\Omega) = c(\Omega)$.

If $B = G_i(B_1, \ldots, B_{p_i})$, then $C = G_i(C_1, \ldots, C_{p_i})$ and we know by
induction that $b_j(\Omega) \sqsubseteq c_j(\Omega)$ for $1 \le j \le p_i$. Since
$[\lambda x_1, \ldots, x_{p_i} . g_i(x_1, \ldots, x_{p_i})]$ is monotone with respect to any of its
arguments, $b(\Omega) \equiv g_i(b_1(\Omega), \ldots, b_{p_i}(\Omega)) \sqsubseteq g_i(c_1(\Omega), \ldots, c_{p_i}(\Omega)) \equiv c(\Omega)$.

Finally, if $B = F(B_1, \ldots, B_n)$ then $b(\Omega) \equiv \Omega \sqsubseteq c(\Omega)$.

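In particular, to any computation sequence $T_0 \to T_1 \to \ldots \to T_n \to T_{n+1} \to \ldots$
according to some rule $\mathcal{C}$ and input D, we associate the chain

    $t_0(\Omega)(d) \sqsubseteq t_1(\Omega)(d) \sqsubseteq \ldots \sqsubseteq t_n(\Omega)(d) \sqsubseteq t_{n+1}(\Omega)(d) \sqsubseteq \ldots$

The corresponding computed partial function is therefore
characterized as: $\mathcal{C}_P = \lambda d . \bigsqcup_{n \ge 0} t_n(\Omega)(d)$.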
From these definitions follows an easy generalization of a theorem
of Cadiou [2]:

Theorem 2 (Cadiou)

Any fixed-point of the equation $f \equiv p(f)$ is an extension of any
function computed by the program F <= P.

Proof. For any natural number m, let $P^m$ be defined as $P^0 = F(X)$
and $P^{m+1} = P\{P^m/F\}$. It is easily seen that $p^i(\Omega) = p(p(\ldots p(\Omega) \ldots))$
(i times). Since Cadiou [2] proved that for any computation sequence
$T_0, T_1, \ldots, T_n, \ldots$ where $T_0 = F(X)$ we have $T_i \le P^i$ for all natural
numbers i, it follows from Lemma C that $t_i(\Omega) \sqsubseteq p^i(\Omega)$ for all i.
The function p being continuous, $f_P = \bigsqcup_{i \ge 0} p^i(\Omega)$, hence $t_i(\Omega) \sqsubseteq f_P$
for any i. It follows that $\mathcal{C}_P = \bigsqcup_{i \ge 0} t_i(\Omega) \sqsubseteq f_P$ and, since $f_P \sqsubseteq f$
for any fixed-point f of p, the conclusion $\mathcal{C}_P \sqsubseteq f$ holds.

□
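
The chain of approximations $p^i(\Omega)$ used in this proof can be made concrete on a small example. The sketch below takes the program F(X) <= IF X = 0 THEN 0 ELSE F(X-1) over the natural numbers; representing the undefined value by Python's `None` and the names `bottom`, `p` and `approximation` are assumptions of this illustration.

```python
OMEGA = None  # stands for the undefined value


def bottom(x):
    """The totally undefined function, written Omega in the text."""
    return OMEGA


def p(f):
    """Functional associated with F(X) <= IF X = 0 THEN 0 ELSE F(X-1)."""
    def improved(x):
        if x is OMEGA:
            return OMEGA
        if x == 0:
            return 0
        return f(x - 1)
    return improved


def approximation(i):
    """Return p applied i times to the undefined function."""
    f = bottom
    for _ in range(i):
        f = p(f)
    return f


# Values of the third approximation on the arguments 0, 1, 2, 3, 4.
third = [approximation(3)(x) for x in range(5)]
```

Each application of `p` defines the function on one more argument, so the approximations form an increasing chain whose limit is the constant function 0 on the natural numbers.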


2. Correct Implementation of Recursion

In this section, we try to characterize the computation rules $\mathcal{C}$
such that $\mathcal{C}_P = f_P$ for any program F <= P, called fixed-point
computation rules.

Here are some computation rules we shall consider, both in lang S
and lang P:

(1) Call by value: substitute for the leftmost-innermost occurrence
    of F after simplifications.

(2) Call by name: substitute for the leftmost-outermost occurrence
    of F after simplifications.

(3) Parallel innermost: substitute for the occurrences of F having
    all of their arguments free of F's.

(4) Parallel outermost: substitute for all the F's which do not
    occur in any argument of another F.

(5) Free argument: substitute for all the occurrences of F having
    at least one of their arguments free of F's after simplifications.

(6) Full substitution: substitute for all the occurrences of F.


2.1 Incorrect Computation Rules

Proposition 1

In lang P, the rules (1), (2), (3) and (5) are incorrect.

Proof. Consider the program F(X,Y) <= IF X = 0 THEN 0 ELSE
F(X+1,F(X,Y)) * F(X-1,F(X,Y)) where * is the parallel multiplication
function 0*x = x*0 = 0. The least fixed-point over the integers
(considered as a discrete data-type) of the corresponding functional
is the zero function $\lambda x,y .$ if $x \equiv \omega$ then $\omega$ else 0. The computation
of F(1,0) using (1), (2) or (3) is infinite. As for rule (5), we
can take the program F(X) <= X.F(F(X)) in the data-type of sequences
of letters as a counter-example. □

Proposition 2 (Morris [23])

In lang S the rules (1) and (3) are incorrect.

Proof. Consider F(X,Y) <= IF X = 0 THEN 0 ELSE F(X-1,F(X,Y)). The
corresponding least fixed-point over the non-negative integers is again
the constant function 0 while the computation of F(1,0) using rules
(1) or (3) is infinite. □
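
The counter-example of Proposition 2 is easy to reproduce. The Python sketch below is an illustration under stated assumptions: call-by-value is modelled by evaluating the inner argument before the outer call, call-by-name by passing zero-argument functions (thunks), and non-termination is approximated by a finite step budget. The names `f_by_value`, `f_by_name`, `OutOfFuel` and `diverges_by_value` are hypothetical.

```python
class OutOfFuel(Exception):
    """Raised when the step budget is exhausted (stands in for non-termination)."""


def f_by_value(x, y, fuel):
    """F(X,Y) <= IF X = 0 THEN 0 ELSE F(X-1,F(X,Y)), arguments evaluated first."""
    if fuel[0] == 0:
        raise OutOfFuel
    fuel[0] -= 1
    if x == 0:
        return 0
    inner = f_by_value(x, y, fuel)  # innermost occurrence of F is computed first
    return f_by_value(x - 1, inner, fuel)


def f_by_name(x, y):
    """Same program, arguments passed unevaluated as thunks."""
    if x() == 0:
        return 0
    return f_by_name(lambda: x() - 1, lambda: f_by_name(x, y))


def diverges_by_value(x, y, budget=200):
    """Report whether the call-by-value computation exceeds the step budget."""
    try:
        f_by_value(x, y, [budget])
    except OutOfFuel:
        return True
    return False
```

With these definitions `f_by_name(lambda: 1, lambda: 0)` returns 0, the value of the least fixed-point, while the call-by-value computation of F(1,0) keeps recomputing F(1,0) and exhausts any budget.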


2.2  Safe  Computation  Rules 

We  now  define  the  class  of  safe  computation  rules,  and  show  that 
they  correspond  to  "correct"  implementations  of  recursion. 

Let $\mathcal{C}$ be a computation rule and B an arbitrary term in the
computation lattice of T by P. In order to describe the effect
of $\mathcal{C}$ on B, we rename $F_1$ the occurrences of F selected for
substitution by $\mathcal{C}$ in B for some input D, and $F_2$ the others.

Definition

We say that $\mathcal{C}$ is a safe computation rule if, for any term
$B\{F/F_1, F/F_2\}$ in the computation lattice of T by P and for any
input D,

    $b\{\Omega/f_1, f_P/f_2\}(d) \equiv b\{\Omega/f_1, \Omega/f_2\}(d)$.

Intuitively, the computation is safe if the values of the F's
which are not substituted (renamed $F_2$) are insufficient: as long as
more information is not obtained about the other arguments (the $F_1$'s)
the information about B cannot be improved.

In  order  to  clarify  this  definition,  let  us  prove  the  safeness  of 
some  of  our  computation  rules. 

Proposition 3

In lang S, the rules (2), i.e., call-by-name and (5), i.e.,
free argument are safe.

Proof. By induction on $\|C\|$ where C = simpl(B): we first notice
that, because of the semantic definition of lang S, if F occurs
in C then $c(\Omega)(d) \equiv \omega$ (remember that C has been simplified and,
when a simplified term has the form IF $C_1$ THEN $C_2$ ELSE $C_3$, we must
have F occurring in $C_1$).

Case $C = A_i$: then any rule is safe.

Case $C = G_i(C_1, \ldots, C_{p_i})$. The letter F occurs necessarily in C,
otherwise we could simplify further. Since both rules select at least
one F on such terms, we know by our previous remark that
$c\{\Omega/f_1, f_P/f_2\}(d) \equiv \omega \equiv c\{\Omega/f_1, \Omega/f_2\}(d)$.

Case $C = F(C_1, \ldots, C_n)$. The safeness of rule (2) is straightforward
since the outermost F is substituted. For the same reason, rule (5)
is safe if at least one of the $C_i$ is constant. If none of the $C_i$'s
is constant, then $c_i\{\Omega/f_1, f_P/f_2\}(d) \equiv \omega$ for $1 \le i \le n$ and we must
prove that $f_P(\omega, \ldots, \omega) \equiv \omega$. This is ensured by imposing in lang S
that all program variables $X_1, \ldots, X_n$ occur in simpl(P), hence
$f_P(\omega, \ldots, \omega) \equiv p(f_P)(\omega, \ldots, \omega) \equiv \omega$. □

Proposition 4

The rules (4), i.e., parallel outermost and (6), i.e., full
substitution are safe in both lang S and lang P.

Proof. By induction on $\|B\|$.

Case $B = A_i$. Any rule is safe.

Case $B = G_i(B_1, \ldots, B_{p_i})$. By induction, $b_j\{\Omega/f_1, f_P/f_2\}(d) \equiv
b_j\{\Omega/f_1, \Omega/f_2\}(d)$ for $1 \le j \le p_i$ in both cases, hence safeness is
also satisfied on b.

Case $B = F(B_1, \ldots, B_n)$. Both rules select the outermost F hence
$b\{\Omega/f_1, f_P/f_2\}(d) \equiv \omega \equiv b\{\Omega/f_1, \Omega/f_2\}(d)$. □


Note that the computation rules that we already recognized as
incorrect are all unsafe. In order to prove that safe rules are
correct, we need the following technical lemma:


Lemma S

If $\mathcal{C}$ is safe, then $B \to_{\mathcal{C}} C$ and min(B,Q) = min(C,Q) imply
$q(\Omega)(d) \sqsubseteq b(\Omega)(d)$ for any terms B, C and Q in the computation
lattice of T by P, and input D.




Proof. Let us first determine some properties of the min of two
terms:

Lemma 3

(i) $\min(G_i(B_1, \ldots, B_{p_i}), G_i(C_1, \ldots, C_{p_i})) = G_i(\min(B_1,C_1), \ldots, \min(B_{p_i},C_{p_i}))$

(ii) $\min(P\{B_1/X_1, \ldots, B_n/X_n\}, G(C_1, \ldots, C_p)) = P\{M_1/X_1, \ldots, M_n/X_n\}$
where $M_1, \ldots, M_n$ are such that
$F(M_1, \ldots, M_n) = \min(F(B_1, \ldots, B_n), G(C_1, \ldots, C_p))$.

Proof. Property (i) is easy and property (ii) follows from the fact
that $P\{M_1/X_1, \ldots, M_n/X_n\} \to^{*} M' \to^{*} P\{B_1/X_1, \ldots, B_n/X_n\}$ with $M_i \to^{*} B_i$
for $1 \le i \le n$ implies that $M' = P\{M'_1/X_1, \ldots, M'_n/X_n\}$ where
$M_i \to^{*} M'_i \to^{*} B_i$ for $1 \le i \le n$.

□


We now prove Lemma S: Let us rename $F_1$ the occurrences of F
selected by $\mathcal{C}$ in B and $F_2$ the others. Let M = min(B,Q) = min(C,Q).
We first prove by induction on $\langle \mathrm{dist}(M,B) + \mathrm{dist}(M,Q), \|M\| \rangle$ that
$Q \le B\{F/F_1, P^m/F_2\}$ for some natural number m. (Here $P^m$ means
$P\{P^{m-1}/F\}$ for m > 0 and $P^0 = F(X_1, \ldots, X_n)$.)

Case $M = A_i$ or $M = X_j$.

In this case, M = B = C = Q and we can choose m = 0.

Case $M = G_i(M_1, \ldots, M_{p_i})$.

By Lemma 1, $B = G_i(B_1, \ldots, B_{p_i})$, $C = G_i(C_1, \ldots, C_{p_i})$ and
$Q = G_i(Q_1, \ldots, Q_{p_i})$. By Lemma 3, $M_j = \min(B_j,Q_j) = \min(C_j,Q_j)$ for
$1 \le j \le p_i$. It follows by induction that $Q_j \le B_j\{F/F_1, P^{m_j}/F_2\}$.
We can then choose $m = \sup_{1 \le j \le p_i} \{m_j\}$ in order to get
$Q \le B\{F/F_1, P^m/F_2\}$.


Case $M = F(M_1, \ldots, M_n)$.

By definition of min, we need only consider the cases:

Case $B = G(B_1, \ldots, B_p)$ and $Q = F(Q_1, \ldots, Q_n)$.

Let $M' = P\{M_1/X_1, \ldots, M_n/X_n\}$ and
$Q' = P\{Q_1/X_1, \ldots, Q_n/X_n\}$. By Lemma 3,
M' = min(B,Q') = min(C,Q'). By Lemma 2,
dist(M',B) + dist(M',Q') < dist(M,B) + dist(M,Q)
so we know by induction that
$Q' \le B\{F/F_1, P^m/F_2\}$ and, a fortiori,
$Q \le B\{F/F_1, P^m/F_2\}$ for some m.

Case $B = F(B_1, \ldots, B_n)$ and $Q = G(Q_1, \ldots, Q_p)$.

Since min(B,Q) = min(C,Q), the term C is also
of the form $C = F(C_1, \ldots, C_n)$. Let
$M' = P\{M_1/X_1, \ldots, M_n/X_n\}$, $B' = P\{B_1/X_1, \ldots, B_n/X_n\}$
and $C' = P\{C_1/X_1, \ldots, C_n/X_n\}$. By Lemma 3, we
know that M' = min(B',Q) = min(C',Q).
By Lemma 2, dist(M',B') + dist(M',Q) < dist(M,B) + dist(M,Q)
and the induction hypothesis tells us that $Q \le B'\{F/F_1, P^m/F_2\}$.
Since the outermost F has not been selected by $\mathcal{C}$ in B then
$B' \le B\{P/F_2\}$. Our last case is then treated since
$Q \le B\{F/F_1, P^{m+1}/F_2\}$.

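It is now easy to finish the proof of Lemma S.

For any m, $p^m(\Omega) \sqsubseteq f_P$ implies $b\{\Omega/f_1, p^m(\Omega)/f_2\} \sqsubseteq b\{\Omega/f_1, f_P/f_2\}$.
By choosing m large enough, we know that $q(\Omega) \sqsubseteq b\{\Omega/f_1, p^m(\Omega)/f_2\}$
and therefore $q(\Omega) \sqsubseteq b\{\Omega/f_1, f_P/f_2\}$. Since $\mathcal{C}$ is safe,
$b\{\Omega/f_1, f_P/f_2\}(d) \equiv b(\Omega)(d)$ and the conclusion $q(\Omega)(d) \sqsubseteq b(\Omega)(d)$
follows.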

Theorem 3

Any safe rule is a fixed-point rule.

Proof. In the computation lattice of $T_0 = F(d)$ by P, let
$T_0, T_1, \ldots, T_n, \ldots$ and $S_0, S_1, \ldots, S_n, \ldots$ (where $S_0 = T_0$) be the computation
sequences corresponding to respectively some safe rule $\mathcal{C}$ and the
full substitution rule. Since $s_n(\Omega) = p^n(\Omega)$ then
$\bigsqcup_{n \ge 0} s_n(\Omega) = \bigsqcup_{n \ge 0} p^n(\Omega) = f_P$. We know by Theorem 2 that $\mathcal{C}_P(d) \sqsubseteq f_P(d)$
and it is therefore sufficient to show that
$\bigsqcup_{n \ge 0} s_n(\Omega)(d) \sqsubseteq \bigsqcup_{n \ge 0} t_n(\Omega)(d)$
in order to prove $\mathcal{C}_P = f_P$.

Let $S_n$ be an arbitrary term in $S_0, S_1, \ldots$. Since there are only
finitely many minorants of $S_n$ in the computation lattice, there exists
some m such that $\min(T_m,S_n) = \min(T_{m+1},S_n)$. The rule $\mathcal{C}$ being safe,
it follows from Lemma S that $s_n(\Omega)(d) \sqsubseteq t_m(\Omega)(d)$, hence
$\bigsqcup_{n \ge 0} s_n(\Omega)(d) \sqsubseteq \bigsqcup_{m \ge 0} t_m(\Omega)(d)$.

□

As a corollary, rules (2) and (5) are fixed-point rules in lang S and
rules (4) and (6) are fixed-point rules in both lang S and lang P.
3. An Optimal Implementation of Recursion in lang S

Among the correct implementations of recursion, we now try to
determine which ones are efficient. This proves unsuccessful in
lang P, but we shall describe an implementation of recursion for
lang S which turns out to be optimal.

We already know that, in lang S, "call-by-name" is a fixed-point
rule, while "call-by-value" is not. However, "call-by-name" is not an
efficient way of computing. For example, in the program
F(X) <= IF X > 0 THEN X-1 ELSE F(F(X+2)) the "call-by-name" computation
of F(0) would be F(0) → F(F(2)) → IF F(2) > 0 THEN F(2)-1 ELSE
F(F(F(2)+2)) → F(2)-1 → 0.

What happens here is that the term F(2) has been duplicated and
subsequently computed twice. We shall describe a computation mechanism,
called the delay-rule, which avoids those duplications, and prove its
optimality.


3.1 Never Do Today What You Can Put Off Until Tomorrow

A natural way to keep track of duplications of terms is to assign
labels to all occurrences of F in a computation sequence, so that
copies of the same F will receive the same label. This can be
achieved by first labelling differently all F's in P; then,
if F is labelled α in $T_n$ and is to be substituted, we label each
occurrence of F after substitution by α followed by whatever
labelling this particular occurrence had in P. For example, using
the same computation as before, and the labelling
IF X > 0 THEN X-1 ELSE F1(F2(X+2)) for P, the previous computation
can be described as:

    F(0) → F1(F2(2)) → IF F2(2) > 0 THEN F2(2)-1 ELSE F11(F12(F2(2)+2))
         → IF 1 > 0 THEN F2(2)-1 ELSE F11(F12(F2(2)+2))
    simplifies to F2(2)-1 → 0.

The whole idea of the delay-rule is to modify "call-by-name" so
that, whenever some occurrence of F is substituted, all the occurrences
having the same label will also be substituted. Hence, the "delay-rule"
selects for substitution the leftmost-outermost F in a simplified
term, as well as all the other F's having the same label.

Consequently, the delay rule computation of F(0) in the program
above is

    F(0) → F1(F2(2)) → IF F2(2) > 0 THEN F2(2)-1 ELSE F11(F12(F2(2)+2))
         → IF 1 > 0 THEN 1-1 ELSE F11(F12(1+2))

simplifies to 0. At this point, it is clear that the "delay rule" is
safe (proof similar to that of Proposition 1); what is not clear is that
the "delay rule" should be more efficient than "call-by-name" and in fact,
in our last example, it was less efficient since it took four substitutions
versus three for "call-by-name" in order to obtain its result. When
"call-by-name" computed F2(2) twice, the delay rule has been computing
it three times! It is a simple exercise in data structuring however to
avoid all those recomputations: instead of actually copying various
occurrences of some $F_\alpha$ in a term, we simply set some pointers to a
unique copy of the term $F_\alpha$. Whenever any occurrence of $F_\alpha$ is chosen
for substitution, the substitution is actually performed in the unique
copy of $F_\alpha$ so that all occurrences of $F_\alpha$ are substituted at the
price of one substitution.
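
The pointer-sharing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `SharedTerm`, its method `substitute`, and the counter are hypothetical names, and the expansion of the occurrence labelled F2(2) is hard-coded for the program F(X) <= IF X > 0 THEN X-1 ELSE F(F(X+2)).

```python
class SharedTerm:
    """Unique copy of a labelled occurrence; every occurrence points to it."""

    def __init__(self, expand):
        self._expand = expand      # performs the substitution when first needed
        self._expanded = False
        self._result = None

    def substitute(self):
        # The substitution is performed once, in the unique copy.
        if not self._expanded:
            self._result = self._expand()
            self._expanded = True
        return self._result


counter = {"substitutions": 0}


def expand_f2_of_2():
    """Substitute the body for F2(2) and simplify; X = 2 takes the THEN branch."""
    counter["substitutions"] += 1
    x = 2
    return x - 1


f2 = SharedTerm(expand_f2_of_2)
occurrences = [f2, f2, f2]   # three occurrences carrying the same label
values = [occurrence.substitute() for occurrence in occurrences]
```

All three occurrences obtain the value 1, while the counter records a single substitution.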


Going a little bit away from our particular programming language
we can sketch an implementation of this idea for, say Algol. The
arguments  of  any  procedure  should  be  stored  as  pointers  to  formal 
expressions,  together  with  a  tag  indicating  that  those  arguments  have 
not  yet  been  computed.  Whenever  the  value  of  an  argument  is  explicitly 
needed,  (for  the  evaluation  of  a  conditional  or  on  the  right-hand  side 
of  an  assignment),  the  tag  is  tested.  If  the  value  of  the  parameter  is 
already  there,  we  use  it;  otherwise  the  corresponding  formal  expression 
must  be  computed,  its  value  kept  for  further  references,  and  the  tag 
is  to  be  changed.  In  a  machine  like  the  Burroughs  B5000  (see,  for 
example,  Lonergan-King  [12]),  the  so-called  "operand  call  syllable" 
would  do  very  nicely:  depending  on  a  tag  stored  with  the  operand,  a 
load  operation  on  the  B5000  gets  its  argument  either  directly  or  through 
a  subroutine  call.  The  delay  rule  would  modify  this  procedure  so  that, 
after  the  subroutine  call,  the  result  would  be  stored  in  place  of  the 
tagged subroutine descriptor. Of course, one would then have to abandon
"side-effects" altogether!

Before proving the optimality of the delay rule let us compare the
efficiency of various computation rules on the programs

    Zer(X) <= IF X > 0 THEN X-1 ELSE Zer(Zer(X+2))

    Ack(X,Y) <= IF X = 0 THEN Y+1
                ELSE IF Y = 0 THEN Ack(X-1,1)
                ELSE Ack(X-1,Ack(X,Y-1))

    Ble(X,Y) <= IF X = 0 THEN 1 ELSE Ble(X-1,Ble(X-Y,Y))

    Fib(X) <= IF X < 2 THEN X ELSE Fib(X-1) + Fib(X-2)

over the integers.

                        Zer(-2)   Ack(2,1)   Ble(8,2)   Fib(5)

    Delay rule             7         14          9        15
    Call by name          25         29          9        15
    Call by value          7         14        341        15
    Free argument          7         23      ~ 4000       15
    Full substitution*    11         23     ~ 10000       15

The entries in this array indicate the number of substitutions
required for computing the values at the top of the corresponding
column, according to the rules at the left of the rows.
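
Some entries of the array can be checked mechanically. The following Python sketch counts substitutions for the program Zer under three argument-passing disciplines; it is an illustration only, in which one substitution is identified with one call of `zer`, arguments are passed as zero-argument functions, and the mode names "value", "name" and "need" (for the delay rule with shared copies) are assumptions of the sketch.

```python
def count_zer(x0, mode):
    """Evaluate Zer(x0) and return (value, number of substitutions)."""
    calls = 0

    def delay(expression):
        if mode == "value":
            value = expression()          # argument computed before the call
            return lambda: value
        if mode == "name":
            return expression             # recomputed at every use
        cache = []                        # "need": computed at most once

        def forced():
            if not cache:
                cache.append(expression())
            return cache[0]
        return forced

    def zer(x):
        # Zer(X) <= IF X > 0 THEN X-1 ELSE Zer(Zer(X+2))
        nonlocal calls
        calls += 1
        if x() > 0:
            return x() - 1
        inner = delay(lambda: zer(delay(lambda: x() + 2)))
        return zer(inner)

    result = zer(delay(lambda: x0))
    return result, calls


by_name = count_zer(-2, "name")
by_need = count_zer(-2, "need")
by_value = count_zer(-2, "value")
```

Under these assumptions the three counts are 25, 7 and 7, in agreement with the Zer(-2) column for call by name, the delay rule and call by value.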

If  he  has  been  through  those  examples,  the  reader  may  feel  quite 
disappointed  because  he  can  beat  the  delay-rule  in  almost  all  cases. 
For example, the hand-computation of Fib(5) only requires five
substitutions if we are careful never to recompute an argument twice.
It would be interesting to study a mechanism in which this type of
computation would be possible; namely one could imagine a set of
simplification rules which could be augmented dynamically, and allow
some computations to be performed by simplifications of the style
F(D) → A. In our scheme of things, however, this type of "built-in"
values  is  not  possible,  since  our  only  means  of  computation  is  through 
substitutions,  and  we  should  blame  inefficiencies  on  the  program,  not 
on  the  computation  rule. 

* Strictly speaking, we are using the full substitution only on
simplified terms, otherwise the computation would always be
infinite.


3.2  Optimality  of  the  Delay  Rule 


So far, we know that the delay rule is safe, and that it never
recomputes copies of the same term. Using the same labelling as before,
we say that a label $F_\alpha$ is maximal in a term if α is not a proper
initial segment of β for any label $F_\beta$ in the term. A term is simple
if all of its labels are maximal. In other words, a term is simple if
all computations of various copies of subterms have been pushed to the
same point. For example, if $T_0$ = F(F(X)) and P = G(X,F1(F2(X)))
then G(G(X,F1(F2(X))),F1(F2(F(X)))) is not simple while
F(G(X,F1(F2(X)))) is simple.

A computation is simple if all F's with the same labels are all
treated alike in all substitutions (if one of them is to be substituted,
all of them are to be substituted). All terms in a simple computation
are necessarily simple. If we are to count for one a substitution of
all F's with the same labels, as justified by our previous exercise
in data structuring, simple computations are more efficient than others.
Namely, if we define $\mathrm{length}(T_0 \to^{*} A)$ as the total number of substitutions
performed during the computation $T_0 \to^{*} A$, we have

Lemma E

For any term A, there exists a simple term $\bar{A}$ with $A \le \bar{A}$ such
that, for any computation $T_0 \to^{*} A$ and simple computation $T_0 \Rightarrow^{*} \bar{A}$,

    $\mathrm{length}(T_0 \Rightarrow^{*} \bar{A}) \le \mathrm{length}(T_0 \to^{*} A)$.

Proof. Let $r(C)$ be the number of maximal labels and $s(C)$ be the
sum of the lengths of the maximal labels in a term $C$, while $q$ and $p$
mean respectively the number of occurrences of $F$ in $T_0$ and $P$. It
is easily proven by induction on $\mathrm{length}(T_0 \to^* C)$ that

$\mathrm{length}(T_0 \to^* C) \geq \varphi(C,p,q)$ where $\varphi(C,p,q) =$ if $p = 1$ then $s(C) - q$ else $\frac{r(C) - q}{p - 1}$.

In a similar way, ($C$ simple) and ($T_0 \Rightarrow^* C$ simple) imply
$\mathrm{length}(T_0 \Rightarrow^* C) = \varphi(C,p,q)$.

Given any term $A$, we can "complete" it into an $\bar{A}$ by substituting
$P$ for all occurrences of $F$ with non-maximal labels until there is none
left. An $\bar{A}$ constructed in this way will be simple and such that
$A \leq \bar{A}$ while $r(\bar{A}) = r(A)$. It follows that, for any computation
$T_0 \to^* A$ and simple computation $T_0 \Rightarrow^* \bar{A}$,

$\mathrm{length}(T_0 \Rightarrow^* \bar{A}) = \varphi(\bar{A},p,q) = \varphi(A,p,q) \leq \mathrm{length}(T_0 \to^* A)$.


The  intuitive  meaning  of  this  lemma  is  very  simple:  nothing  is  to 
be  gained  by  working  on  individual  copies  of  the  same  term.  At  the  same 
price,  we  get  more  information  by  substituting  all  copies  of  the  same 
occurrences .  In  particular,  all  the  computation  rules  described  so  far 
will  be  improved  by  "lumping"  together  occurrences  of  F  with  the  same 
labels,  thus  becoming  simple  rules.  However  they  may  still  perform 
unnecessary  substitutions  unless 

Theorem 4

Any  computation  rule  which  is  simple,  safe  and  performs  at  most 
one  substitution  at  each  computation  step  is  optimal. 

Proof. Let $T_0$ be a term, F(X) <= P a program, and consider a safe and
simple computation rule performing only one substitution at a time.
Let $T_0 \Rightarrow T_1 \Rightarrow \ldots \Rightarrow T_n \Rightarrow T_{n+1} \Rightarrow \ldots$ be the (simple) computation
sequence of $T_0$ according to this rule for some input $D$.

If $T$ is a term in the computation lattice of $T_0$ by $P$, let us
consider an arbitrary computation $T_0 \to^* T$, and prove that whatever
approximation $t(\Omega)(d)$ of $t_0(f_P)(d)$ is computed by $T$
will be computed faster by the rule. For this purpose, we construct
$\bar{T}$ as in Lemma E, and consider a simple computation $T_0 \Rightarrow^* \bar{T}$
(the argument in Lemma E not only proves the existence of $\bar{T}$ but also
that of a simple computation $T_0 \Rightarrow^* \bar{T}$).

Let $i$ be some natural number such that $T_i \leq \bar{T}$ and $T_{i+1} \not\leq \bar{T}$.
Since the rule performs only one substitution at a time, this implies
$T_i = \min(T_{i+1}, \bar{T})$. By Lemma S, we then know that
$\bar{t}(\Omega)(d) \sqsubseteq t_i(\Omega)(d)$. Using Lemmas E and C now, $T \leq \bar{T}$ implies
$t(\Omega)(d) \sqsubseteq \bar{t}(\Omega)(d)$ and $\mathrm{length}(T_0 \Rightarrow^* \bar{T}) \leq \mathrm{length}(T_0 \to^* T)$. Since both
$T_0 \Rightarrow^* \bar{T}$ and $T_0 \Rightarrow^* T_i$ are simple and $T_i \leq \bar{T}$, we have
$\mathrm{length}(T_0 \Rightarrow^* T_i) \leq \mathrm{length}(T_0 \Rightarrow^* \bar{T})$, hence $t(\Omega)(d) \sqsubseteq t_i(\Omega)(d)$ while
$\mathrm{length}(T_0 \Rightarrow^* T_i) \leq \mathrm{length}(T_0 \to^* T)$.

□


We  shall  derive  two  applications  of  this  theorem. 

Corollary  1 

The  delay  rule  is  optimal  in  lang  S  . 

Proof. The delay rule has all the properties required by Theorem 4.

□ 


Corollary  2 

In lang S, "call by value" is optimal whenever the least fixed-point
$f_P$ corresponding to the program F(X) <= P is a strict function.
(The function $f_P$ is strict if $f_P(\ldots,\omega,\ldots) = \omega$.)


Proof. Since "call by value" is clearly a simple rule and performs
at most one substitution at each step, we only need to prove that it is
safe whenever $f_P$ is strict. We prove that the substitution $B \to B'$
is safe in that case by induction on $\|C\|$ where $C = \mathrm{simpl}(B)$:

Case $C = A_i$. Any rule is safe.

Case $C = G_i(C_1, \ldots, C_{p_i})$. Same argument as for the safeness of
"call by name".

Case $C = F(C_1, \ldots, C_n)$. If $F$ does not occur in any of the $C_i$'s,
then the outermost substitution is performed, which is clearly safe.
Otherwise, let $C_i$ be the leftmost term in which $F$ occurs. Then,

ci^fij yy(a) =  «  ^  c^f^y^Kd) ■  fp(. 5lu£ 

c(fVf1,q/f2}(d)  . 

□ 


3.3 Sequential Functions

The applications of Theorem 4 given in the previous section do not
quite match the generality of the result. In particular, the data-type
on which lang S is computing has no chain of length more than two.
What we shall now sketch is a theory of sequential functions, where
Theorem 4 finds its full application.

The  relevant  notion  here  seems  to  be 

Definition 

A function $\lambda x_1,\ldots,x_n.\,g(x_1,\ldots,x_n)$ in $[D_1 \times \ldots \times D_n \to D]$ is
sequential if, for all $x_1 \in D_1, \ldots, x_n \in D_n$ there exists an $i \in [1,n]$ such
that, for all $y_1,\ldots,y_n$ such that $x_j \sqsubseteq y_j$ for $j \in [1,n]$ and
$x_i = y_i$ we have $g(x_1,\ldots,x_n) = g(y_1,\ldots,y_n)$.

Intuitively,  g  is  sequential  if,  at  any  given  moment,  the  value 
of  (at  least)  one  of  its  arguments  is  crucially  needed  in  order  to  better 
approximate  the  value  of  the  result.  For  the  purpose  of  our  theory,  we 
need  to  check  that  sequentiality  has  the  correct  closure  property, 
namely 


Proposition  S 

Sequentiality  is  preserved  by  composition  of  functions  and 
fixed-point  operators . 


Proof. 


— Composition. If $\lambda z_1,\ldots,z_n.\,g(z_1,\ldots,z_n)$ and $\lambda x_1,\ldots,x_m.\,f_i(x_1,\ldots,x_m)$
for $1 \leq i \leq n$ are sequential, then

$\varphi = \lambda x_1,\ldots,x_m.\,g(f_1(x_1,\ldots,x_m),\ldots,f_n(x_1,\ldots,x_m))$

is also sequential: for any $x_1,\ldots,x_m$ and $i \in [1,n]$, let
$z_i = f_i(x_1,\ldots,x_m)$: since $g$ is sequential, $z_1,\ldots,z_n$ determines
some $i_0 \in [1,n]$ and, $f_{i_0}$ being also sequential, $x_1,\ldots,x_m$ determine
some $j \in [1,m]$ which can then be used for the sequentiality of $\varphi$.

— Fixed-point operator. If the functions $\lambda x_1,\ldots,x_n.\,f_i(x_1,\ldots,x_n)$
are sequential for any natural number $i$, the function

$\varphi = \lambda x_1,\ldots,x_n.\,\sqcup_{i \geq 0} f_i(x_1,\ldots,x_n)$

is also sequential: for any $x_1,\ldots,x_n$ sequentiality of the $f_i$'s
determines a sequence $j_0, j_1, \ldots$ where $j_i \in [1,n]$. At least one of
the $j_i$'s must occur infinitely often in this sequence, and it can be
used for proving that $\varphi$ is sequential.

□ 


For  example,  over  a  discrete  data-type,  conditional  and  strict 
functions  are  sequential;  hence,  by  Proposition  S,  all  functions 
definable  in  lang  S  are  sequential. 

In a data-type which is a lattice, the functions $\lambda x,y.\,\sup(x,y)$
and $\lambda x,y.\,\inf(x,y)$ are not sequential in general.
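
The definition can be tested by brute force on a small discrete data-type. The sketch below is only an illustration under stated assumptions: the names `BOT`, `below`, `is_sequential`, `cond` and `parallel_or` are hypothetical, the domain is the flat booleans with an undefined element, and `parallel_or` is a standard example of a monotone function that fails the definition.

```python
from itertools import product

BOT = None  # the undefined element of a flat (discrete) data-type


def below(x, y):
    """x is below y in a flat domain: BOT is below everything else."""
    return x is BOT or x == y


def is_sequential(g, arity, domain):
    """Check the definition of sequentiality exhaustively over a finite domain."""
    points = list(product(domain, repeat=arity))
    for x in points:
        def index_works(i):
            # every y above x that agrees with x at position i gives the same result
            return all(
                g(*y) == g(*x)
                for y in points
                if y[i] == x[i] and all(below(a, b) for a, b in zip(x, y))
            )
        if not any(index_works(i) for i in range(arity)):
            return False
    return True


def cond(b, u, v):
    """Conditional, strict in its first argument."""
    if b is BOT:
        return BOT
    return u if b else v


def parallel_or(u, v):
    """Returns True as soon as either argument is True, even if the other is BOT."""
    if u is True or v is True:
        return True
    if u is False and v is False:
        return False
    return BOT


FLAT_BOOL = [BOT, True, False]
print(is_sequential(cond, 3, FLAT_BOOL))
print(is_sequential(parallel_or, 2, FLAT_BOOL))
```

At the point where both arguments of `parallel_or` are undefined, neither index can be chosen, since improving either argument alone to True already changes the result.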

The set $W$ of finite or infinite words over some vocabulary $\Sigma$
becomes a data-type under the partial ordering: $x \sqsubseteq y$ whenever $x$
is an initial segment of $y$.

In $W$, the functions

$\lambda x.\,\mathrm{first}(x)$ (take the first letter of $x$),

$\lambda x.\,\mathrm{rest}(x)$ (erase the first letter of $x$),

and $\lambda x,y.\,x \odot y$ (append the first letter of $x$ to $y$) are
sequential.†

This is clear enough for first and rest since any function of one argument
is sequential. For $x \odot y$, if $x = \Lambda$, i.e., $x$ is the empty word, then
the first argument is to be chosen for sequentiality since $\Lambda \odot y = \omega$;
otherwise, $x \neq \Lambda$ and any $x'$ such that $x \sqsubseteq x'$ will have the same first
letter so that we can use the other argument $y$ for sequentiality.

—  Yet  another  programming  language.  We  define  a  new  language  lang  GS 
similar  to  our  previous  ones  except  that  all  base  functions  must  be 
sequential. 

† The relevance of these functions and data-type to parallel programs
is shown in Kahn [11].

Let the generalized delay rule (GDR) be the computation rule
defined as follows:

First, using the same type of data-structuring as for the delay
rule, the GDR will be simple.

In any term $T$, the GDR will select at most one $F$ (or rather
set of $F$'s with the same labels) as follows:

If $T = A_i$, no $F$ is chosen.

If $T = G_i(T_1,\ldots,T_{p_i})$, the $F$ will be the $F$ chosen by the GDR
in $T_j$ where $j$ is the index corresponding to the sequentiality
of $g_i$ with the arguments $t_1(\Omega)(d),\ldots,t_{p_i}(\Omega)(d)$. Of course,
this requires the choice of $j$ to be effective; also, since we
want the GDR to be simple, all $F$'s with the same labels occurring
in other subterms are also to be substituted.

If $T = F(T_1,\ldots,T_n)$ the outermost $F$ is selected by the GDR.

We  can  apply  Theorem  4  again  in  order  to  prove 


Corollary  3 

The  generalized  delay  rule  is  optimal  in  lang  GS  . 

Proof.  Since  the  GDR  is  simple  and  performs  at  most  one  substitution 
at  each  step,  all  we  need  to  prove  is  that  it  is  safe. 

The proof is by induction on $\|B\|$ where $B$ is any term in the
computation lattice of $T_0 = T\{D/X\}$ by $P$.

The cases $B = A_i$ or $B = F(B_1,\ldots,B_n)$ are easy.

If $B = G_i(B_1,\ldots,B_{p_i})$ and $j$ is the sequentiality index of



gi(b1(n)(d),...,bp>(Q)(d))  ,  then  bj  IcV^  >  fp/f2^^)  ~bj(n)(5)  b^ 
induction.  Since  bk(o)(d)  c  b^Cy^ ,  f^/f^d)  ,  the  very  definition 
of  sequentiality  gives  us  b {f^ f ^  ,  f^f2}(d)  h  bj^f^,  Q/f2}(d)  . 

□ 


Conclusion 

The  results  of  this  chapter  generalize  quite  nicely  to  a  programming 
language  where  we  introduce  assignments,  goto's  and  while  statements. 

What  is  less  clear  to  the  author  is  how  to  perform  computation  in  a 
"typeless"  recursive  language  where  procedures  can  be  passed  as  arguments, 
say  in  a  full  LISP  for  example.  It  might  also  be  interesting  to  study 
(or  prove  the  non-existence  of)  optimal  computation  rules  when  the 
simplifications  allowed  are  less  restrictive  than  the  ones  we  chose. 

Chapter 3. PROOFS BASED UPON MONOTONICITY

In this chapter, we investigate how far into the theory of
computation one can get from the mere hypothesis that programs
represent monotone mappings between data-types, thus ignoring continuity.

For  this  purpose,  we  introduce  a  formal  system  in  which  the  methods 
of  "inductive  assertions"  and  "structural  induction"  for  proving 
properties  of  programs  can  be  expressed  and  justified. 

The  reader  interested  in  the  logic  developed  here  is  expected 
to  be  familiar  with  the  work  of  Milner  [19].  However,  a  detailed 
knowledge  of  the  formalism  should  not  be  necessary  for  understanding 
the  various  uses  we  make  of  it.  In  particular,  the  examples  given  are 
described  informally,  despite  the  fact  that  all  the  proofs  can  be 
expressed  within  the  logical  system. 


1.  A  Formal  System  for  the  Time  Being 
1.1  Syntax 

Terms,  which  are  meant  to  denote  monotone  functions  of  some  type, 
are  defined  as  follows: 

(i)  Typed  identifiers  are  terms.  (We  shall  almost  always  omit  the 
type  subscript.) 

(ii) If $s$ is a term of type $\alpha \to \beta$ and $t$ a term of type $\alpha$,
then $s(t)$ is a term of type $\beta$.

(iii) If $x$ is of type $\alpha$ and $t$ of type $\beta$, then $[\lambda x.t]$ is a
term of type $\alpha \to \beta$.

(iv) If $P$ is a wff, $t$ a term of type $\alpha$ and $x$ a variable,
then $[\sqcup_{\{x|P\}} t]$ and $[\sqcap_{\{x|P\}} t]$ are terms of type $\alpha$.

A well-formed-formula $P$ is a conjunction of equalities or
inequalities between terms of the form $p \sqsubseteq q,\ r = s,\ \ldots,\ u \sqsubseteq t$.

A proof is a sequence of implications between wffs $P \vdash Q$, each being
derived from the preceding implications by an axiom or a rule of inference.

Variables are bound by $\lambda$, $\sqcup$ and $\sqcap$. We write $s\{t/x\}$ and
$P\{t/x\}$ to denote the result of replacing all free occurrences of $x$
in $s$ and $P$ by $t$, after renaming the necessary bound variables.


1.2  Semantics 

A standard model is a denumerable family of complete lattices $D_\alpha$,
one at each type $\alpha$. Each $D_\alpha$ has a minimal element $\mathrm{UU}_\alpha$ and maximal
element $\mathrm{OO}_\alpha$. The two base types are $I$ and $B$. The domain of
individuals $D_I$ can be any complete lattice while $D_B$ is the
two-element lattice in which false lies below true.

If $\alpha$ and $\beta$ are types, then $\alpha \to \beta$ is also a type and $D_{\alpha \to \beta}$ is
the set of monotone mappings from $D_\alpha$ into $D_\beta$. It is easily checked
that, whenever $D_\alpha$ and $D_\beta$ are complete lattices, $D_{\alpha \to \beta}$ is itself
a complete lattice. Terms of type $\alpha$ are intended to denote elements
of $D_\alpha$.


1.3 Axioms and Rules of Inference

Here x, y, z, f represent variables, s, t terms and P, Q, R
wffs. Axioms and rules are meant at all syntactically correct types.

(a) Axioms

(Reflexivity)    D1:  $\vdash x \sqsubseteq x$
(Transitivity)   D2:  $x \sqsubseteq y,\ y \sqsubseteq z \vdash x \sqsubseteq z$
(Antisymmetry)   D3:  $x \sqsubseteq y,\ y \sqsubseteq x \vdash x = y$
                      $x = y \vdash x \sqsubseteq y,\ y \sqsubseteq x$
(Minimality)     D4:  $\vdash \mathrm{UU} \sqsubseteq x$
(Maximality)     D5:  $\vdash x \sqsubseteq \mathrm{OO}$
(Monotonicity)   F1:  $x \sqsubseteq y \vdash f(x) \sqsubseteq f(y)$
(λ-conversion)   F2:  $\vdash [\lambda x.s](t) \sqsubseteq s\{t/x\}$
(bottoms-tops)   F3:  $\vdash \mathrm{UU}(x) \sqsubseteq \mathrm{UU}$
(joins)          F4:  $P\{y/x\} \vdash t\{y/x\} \sqsubseteq \sqcup_{\{x|P\}} t$
(meets)          F5:  $P\{y/x\} \vdash \sqcap_{\{x|P\}} t \sqsubseteq t\{y/x\}$
(Inclusion)      W1:  $P \vdash Q$  (Q is a sub-conjunct of P)

(b) Rules of inference

(Conjunction)    R1:  from $P \vdash Q$ and $P \vdash R$
                      infer $P \vdash Q, R$
(Cut)            R2:  from $P \vdash Q$ and $Q \vdash R$
                      infer $P \vdash R$
(Substitution)   R3:  from $P \vdash Q$
                      infer $P\{s/x\} \vdash Q\{s/x\}$
(Extensionality) R4:  from $P \vdash f(x) \sqsubseteq g(x)$
                      infer $P \vdash f \sqsubseteq g$  (x not free in P)
(Cases)          R5:  from $P\{\mathrm{false}/x\} \vdash Q$ and $P\{\mathrm{true}/x\} \vdash Q$
                      infer $P \vdash Q$

Here, false and true are abbreviations for $\mathrm{UU}_B$ and
$\mathrm{OO}_B$ respectively.

(meets)          R6:  from $Q, P \vdash y \sqsubseteq t$
                      infer $Q \vdash y \sqsubseteq \sqcap_{\{x|P\}} t$  (x not free in Q)
(joins)          R7:  from $Q, P \vdash t \sqsubseteq y$
                      infer $Q \vdash \sqcup_{\{x|P\}} t \sqsubseteq y$  (x not free in Q)


1.4 Soundness

In order to establish validity of the axioms and rules of inference,
one first ought to make sure that terms without free variables indeed
denote elements of the complete lattice of the corresponding type. This
is easy for application and λ-abstraction (see Milner [19]). For
meets and joins, we have to prove in essence that if for each $i \in I$ the
function $f_i$ is monotonic then $\sqcap_{i \in I} f_i$ and $\sqcup_{i \in I} f_i$ are also monotonic.

Let $x \sqsubseteq y$. For all $i \in I$, we have

$\sqcap_{i \in I} f_i(x) \sqsubseteq f_i(x) \sqsubseteq f_i(y) \sqsubseteq \sqcup_{i \in I} f_i(y)$.

It follows by definition of $\sqcap$ and $\sqcup$ that

$\sqcap_{i \in I} f_i(x) \sqsubseteq \sqcap_{i \in I} f_i(y)$ and $\sqcup_{i \in I} f_i(x) \sqsubseteq \sqcup_{i \in I} f_i(y)$

and by definition again

$[\sqcap_{i \in I} f_i](x) \sqsubseteq [\sqcap_{i \in I} f_i](y)$

$[\sqcup_{i \in I} f_i](x) \sqsubseteq [\sqcup_{i \in I} f_i](y)$

Using exactly the same approach as Milner [19], one can then go
through the axioms and rules of inference, and justify their validity.
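
The monotonicity argument for meets and joins can be checked exhaustively on a small complete lattice. The following is a minimal sketch, assuming the lattice of subsets of {0, 1, 2} ordered by inclusion; the family `FAMILY` and the helper names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

UNIVERSE = frozenset(range(3))
# all elements of the powerset lattice, ordered by set inclusion
POINTS = [frozenset(c) for r in range(4) for c in combinations(sorted(UNIVERSE), r)]


def is_monotone(f):
    """x below y must imply f(x) below f(y), for every pair of points."""
    return all(f(x) <= f(y) for x in POINTS for y in POINTS if x <= y)


def pointwise_meet(fs):
    def meet(x):
        out = UNIVERSE  # the meet of an empty family is the top element
        for f in fs:
            out &= f(x)
        return out
    return meet


def pointwise_join(fs):
    def join(x):
        out = frozenset()  # the join of an empty family is the bottom element
        for f in fs:
            out |= f(x)
        return out
    return join


# a hypothetical family of monotone maps on the lattice
FAMILY = [
    lambda x: x | frozenset({0}),
    lambda x: x & frozenset({1, 2}),
    lambda x: frozenset(n + 1 for n in x if n < 2),
]

print(all(is_monotone(f) for f in FAMILY))
print(is_monotone(pointwise_meet(FAMILY)), is_monotone(pointwise_join(FAMILY)))
```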

1.5 Pragmatics

We shall use the following abbreviations:

(1) By the Knaster-Tarski theorem, we can characterize the least fixed-point
of $\lambda x.f(x)$ as the greatest lower bound of $\{x \mid f(x) \sqsubseteq x\}$. We shall
therefore use $\mu x.f(x)$ as an abbreviation for $\sqcap_{\{x|f(x) \sqsubseteq x\}} (x)$. The
equivalents of rules F5 and R6 are then:

R8:  $\vdash f(\mu x.f(x)) \sqsubseteq \mu x.f(x)$

R9:  $f(y) \sqsubseteq y \vdash \mu x.f(x) \sqsubseteq y$

The rule R9 was named fixed-point induction by Park [26].

We shall use the notations f <= τ(f) and $f_\tau$ as alternatives
to $[\mu f.\tau(f)]$.
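
On a finite lattice the Knaster-Tarski characterization can be computed directly and compared with plain iteration from the bottom element. The sketch below assumes the lattice of subsets of {0, 1, 2, 3} ordered by inclusion; the map `f` and the function names are hypothetical.

```python
from itertools import combinations

UNIVERSE = frozenset(range(4))


def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def mu_as_meet(f):
    """Greatest lower bound (intersection) of all x with f(x) below x."""
    result = UNIVERSE
    for x in subsets(UNIVERSE):
        if f(x) <= x:
            result &= x
    return result


def mu_by_iteration(f):
    """Iterate f from the bottom element; this terminates for monotone f on a finite lattice."""
    x = frozenset()
    while f(x) != x:
        x = f(x)
    return x


def f(x):
    # hypothetical monotone map: always yields 0, and n + 1 for each n < 2 in x
    return frozenset({0}) | frozenset(n + 1 for n in x if n < 2)


m = mu_as_meet(f)
print(sorted(m), m == mu_by_iteration(f))
```

Rule R8 corresponds to `f(m) <= m`, and rule R9 to `m <= y` for every `y` with `f(y) <= y`.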


(2) One should not confuse the domain $D_B$, the two-element lattice with
false below true, with the boolean data-type, in which TT and FF both
lie above UU. Here $D_B$ should be interpreted as the range of some
semi-decision procedure.

Let us now suppose that the domain $D_\alpha$ is characterized by a
semi-decision predicate $\lambda x.\mathcal{D}(x)$ mapping $D_\alpha$ into $D_B$ such that
$\mathcal{D}(x) = \mathrm{false}$ if and only if $x = \mathrm{UU}_\alpha$. We can then interpret the
logical formula $\forall y \in \mathcal{D}.\,P(y)$ as $\sqcap_{\{y|\mathcal{D}(y) = \mathrm{true}\}} (P(y))$, where $P$
belongs to $D_\alpha \to D_B$. This justifies using $\forall y \in \mathcal{D}.\,P(y)$,
or, when no confusion can arise, $\forall y.P(y)$ as an abbreviation for
$\sqcap_{\{y|\mathcal{D}(y) = \mathrm{true}\}} (P(y))$. Similarly, $\exists y.P(y)$ will abbreviate
$\sqcup_{\{y|\mathcal{D}(y) = \mathrm{true}\}} (P(y))$.

Rules F4, F5, R6 and R7 then translate into the following equivalent
to the rules of first-order logic:

(i)   $\forall y.P(y) = \mathrm{true},\ \mathcal{D}(a) = \mathrm{true} \vdash P(a) = \mathrm{true}$

(ii)  $P(a) = \mathrm{true},\ \mathcal{D}(a) = \mathrm{true} \vdash \exists y.P(y) = \mathrm{true}$

(iii) from $Q,\ \mathcal{D}(y) = \mathrm{true} \vdash P(y) = \mathrm{true}$
      infer $Q \vdash \forall y.P(y) = \mathrm{true}$  (y not free in Q)

(iv)  from $Q,\ \mathcal{D}(y) = \mathrm{true} \vdash P(y) = \mathrm{false}$
      infer $Q \vdash \exists y.P(y) = \mathrm{false}$  (y not free in Q)



Examples  of  Proofs 
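Example 1. The proof that

$[\sqcup_{\{i|I\}} f(i)](x) = \sqcup_{\{i|I\}} f(i)(x)$

is quite instructive, and we sketch it here:

First   $I \vdash f(i) \sqsubseteq \sqcup_{\{i|I\}} f(i)$   (F4)

        $I \vdash f(i)(x) \sqsubseteq [\sqcup_{\{i|I\}} f(i)](x)$   (Appl)

(The rule (Appl) $f \sqsubseteq g \vdash f(x) \sqsubseteq g(x)$ is derivable from F1 and F2.)

        $\vdash \sqcup_{\{i|I\}} f(i)(x) \sqsubseteq [\sqcup_{\{i|I\}} f(i)](x)$   (R7)

then    $I \vdash f(i)(x) \sqsubseteq \sqcup_{\{i|I\}} f(i)(x)$   (F4)

        $I \vdash f(i) \sqsubseteq [\lambda x.\,\sqcup_{\{i|I\}} f(i)(x)]$   (R4)

        $\vdash \sqcup_{\{i|I\}} f(i) \sqsubseteq [\lambda x.\,\sqcup_{\{i|I\}} f(i)(x)]$   (R7)

        $\vdash [\sqcup_{\{i|I\}} f(i)](x) \sqsubseteq \sqcup_{\{i|I\}} f(i)(x)$   (Appl) and (F2).
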


Example  2.  Let  us  prove  that 

(a) $\mu f.s(f,f) = \mu f.s(f, \mu f.s(f,f))$

(b) $\mu f.s(f,f) = \mu f.s(f, s(f,f))$.

In other words, we must establish the equivalence of the following
three programs:

f <= s(f,f)
g <= s(g,f)
h <= s(h,s(h,h))

Proof of (a). Since $s(f,f) = f$, we know by fixed-point induction
that $g \sqsubseteq f$. By monotonicity of $s$, this implies $s(g,g) \sqsubseteq s(g,f)$.
Since $g = s(g,f)$, we have $s(g,g) \sqsubseteq g$ and $f \sqsubseteq g$ follows by
fixed-point induction again.

Proof of (b). By definition, $f = s(f,f) = s(f,s(f,f))$ and therefore,
$h \sqsubseteq f$ by fixed-point induction.

In order to prove that $f \sqsubseteq h$, let us use the auxiliary program
k <= s(h,s(h,k))

Since $s(h,s(h,s(h,h))) = s(h,h)$, the rule of fixed-point induction
tells us that

$k \sqsubseteq s(h,h)$;   (1)

but we know by (a) that $k = h$, and (1) becomes $h \sqsubseteq s(h,h)$.

By monotonicity of $s$, this implies $s(h,h) \sqsubseteq s(h,s(h,h))$ which, by
definition of $h$, reduces to $s(h,h) \sqsubseteq h$. One last application of
fixed-point induction and we prove $f \sqsubseteq h$.
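
The equivalence of the three programs f, g and h can be observed on a finite lattice, where least fixed points are reached by iterating from the bottom element. This is only a sanity check under assumptions, not a proof: the lattice is the subsets of {0, 1, 2, 3}, and the monotone functions `s1`, `s2`, `s3` are hypothetical samples.

```python
def lfp(f):
    """Least fixed point of a monotone map on a finite powerset lattice."""
    x = frozenset()
    while f(x) != x:
        x = f(x)
    return x


def three_programs_agree(s):
    """Compare f <= s(f,f), g <= s(g,f) and h <= s(h,s(h,h)) for a monotone s."""
    f = lfp(lambda x: s(x, x))
    g = lfp(lambda x: s(x, f))
    h = lfp(lambda x: s(x, s(x, x)))
    return f == g == h


# hypothetical monotone binary functions over subsets of {0, 1, 2, 3}
def s1(a, b):
    return frozenset({0}) | frozenset(n + 1 for n in a & b if n < 3)


def s2(a, b):
    return a | frozenset({1}) | frozenset(n + 2 for n in b if n + 2 <= 3)


def s3(a, b):
    return a & b


print([three_programs_agree(s) for s in (s1, s2, s3)])
```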

Example 3. For any functions s and t,

$\mu f.s(t(f)) = s(\mu g.t(s(g)))$.

That is, the programs f <= s(t(f)) and g <= t(s(g)) are related
by f = s(g) and g = t(f). Since $f_{st} = s(t(f_{st}))$ we have
$t f_{st} = t s t f_{st}$ and, by fixed-point induction, $f_{ts} \sqsubseteq t f_{st}$. By
symmetry $f_{st} \sqsubseteq s f_{ts}$ hence $t f_{st} \sqsubseteq t s f_{ts} = f_{ts}$.

Example 4. Let f(x) <= g(f(h(x)),f(k(x))) and y <= g(y,y).

We prove that $f(x) = y$. Since
$g([\lambda x.y](h(x)), [\lambda x.y](k(x))) = g(y,y) = y = [\lambda x.y](x)$, we know
by fixed-point induction that $f \sqsubseteq [\lambda x.y]$ hence $f(x) \sqsubseteq y$. On
the other hand, $g(f(\mathrm{UU}), f(\mathrm{UU})) \sqsubseteq g(f(h(\mathrm{UU})), f(k(\mathrm{UU})))$ by monotonicity,
and $g(f(\mathrm{UU}), f(\mathrm{UU})) \sqsubseteq f(\mathrm{UU})$ follows from $f(\mathrm{UU}) = g(f(h(\mathrm{UU})), f(k(\mathrm{UU})))$.
We conclude $y \sqsubseteq f(\mathrm{UU})$ by fixed-point induction and, since
$f(\mathrm{UU}) \sqsubseteq f(x)$, we proved that $y \sqsubseteq f(x)$.


Example 5. If the two functions $\lambda f.s(f)$ and $\lambda f.t(f)$ commute, i.e.,
$st = ts$, then Example 3 tells us that $f_{st} = s(f_{st})$ and $f_{st} = t(f_{st})$
so that $f_s \sqsubseteq f_{st}$ and $f_t \sqsubseteq f_{st}$. (We can say that $f_s$ and $f_t$ are
weakly equivalent.)


The similarity between some of those results and better known
ones in linear algebra should not surprise us since linear algebra
can be used as a model of our formal system. The base domain will
be the set of vector subspaces of some space $V$. The natural ordering
is inverted: $V_1 \sqsubseteq V_2$ holds whenever $V_2$ is a subspace of $V_1$.
The minimal element UU corresponds to the space $V$ itself while
the vector space containing only 0 corresponds to OO. Linear
transformations over $V$ are then monotone mappings from the base domain
into itself with respect to that ordering, and, if the dimension of $V$ is
infinite, they are not continuous in general. The least fixed-point of a
linear transformation $A$ of that type is then the eigenspace of $A$ having
maximal dimension.

1.6  A  Possible  Weakness  of  the  System 

Let us consider the inference rule

RT:  from $P,\ x \sqsubseteq g(x) \vdash f(x) \sqsubseteq g(f(x))$
     infer $P \vdash \mu x.f(x) \sqsubseteq g(\mu x.f(x))$  (x not free in P)

Is RT provable or not within our system? Although we have not
been able to settle this question, we shall be able to show that rule RT
must be valid in any standard model of our formal system.

Before doing so, let us point out that fixed-point induction can
be derived from RT and that using RT would somewhat simplify the
proofs in the previous examples. For instance, the proof that $f \sqsubseteq h$,
where $f = \mu x.s(x,x)$ and $h = \mu x.s(x,s(x,x))$ could go as follows:

Let us assume $y \sqsubseteq h$ and $y \sqsubseteq s(y,y)$. In order to apply rule RT,
we shall prove that

$y \sqsubseteq h,\ y \sqsubseteq s(y,y) \vdash s(y,y) \sqsubseteq h,\ s(y,y) \sqsubseteq s(s(y,y),s(y,y))$

and therefore conclude that $\vdash f \sqsubseteq h,\ f \sqsubseteq s(f,f)$ so, a fortiori,
$\vdash f \sqsubseteq h$.

By monotonicity $y \sqsubseteq s(y,y) \vdash s(y,y) \sqsubseteq s(s(y,y),s(y,y))$ and
$y \sqsubseteq s(y,y) \vdash s(y,y) \sqsubseteq s(y,s(y,y))$. Therefore, using monotonicity
three times again, $y \sqsubseteq s(y,y),\ y \sqsubseteq h \vdash s(y,y) \sqsubseteq s(h,s(h,h))$. But
$h = s(h,s(h,h))$ and, putting everything together, we get
$y \sqsubseteq h,\ y \sqsubseteq s(y,y) \vdash s(y,y) \sqsubseteq h,\ s(y,y) \sqsubseteq s(s(y,y),s(y,y))$.

We shall now justify the rule. To each monotone function $t$
mapping $\mathcal{D} \to \mathcal{D}$ and ordinal number $\alpha$, we associate an element
$t^\alpha(\mathrm{UU}) \in \mathcal{D}$ as follows:

(i) $t^0(\mathrm{UU}) = \mathrm{UU}$

(ii) $t^{\alpha+1}(\mathrm{UU}) = t(t^\alpha(\mathrm{UU}))$

(iii) If $\alpha = \lim_{\beta < \alpha}(\beta)$ is a limit ordinal, $t^\alpha(\mathrm{UU}) = \sqcup_{\beta < \alpha} \{t^\beta(\mathrm{UU})\}$.

More concisely, $t^\alpha(\mathrm{UU}) = t(\sqcup_{\beta < \alpha} \{t^\beta(\mathrm{UU})\})$, if we agree that $\sqcup(\emptyset) = \mathrm{UU}$.

This sequence has the properties that $\beta \leq \gamma$ implies
$t^\beta(\mathrm{UU}) \sqsubseteq t^\gamma(\mathrm{UU}) \sqsubseteq f_t$ for all ordinals $\beta$ and $\gamma$, and $t^\alpha(\mathrm{UU}) = t^{\alpha+1}(\mathrm{UU})$
implies $t^\alpha(\mathrm{UU}) = f_t$ for any ordinal $\alpha$.

Hence, if we choose $\alpha$ to be the first ordinal not embeddable
in $\mathcal{D}$, the sequence $t^0(\mathrm{UU}), t^1(\mathrm{UU}), \ldots, t^\alpha(\mathrm{UU})$ has "too many"
elements and $t^\alpha(\mathrm{UU}) = f_t$. (See Cadiou [2] or Hitchcock-Park [8].)
Now, from the hypothesis $F \sqsubseteq s(F) \vdash t(F) \sqsubseteq s(t(F))$, we can
deduce that, for all ordinals $\alpha$,

$t^\alpha(\mathrm{UU}) \sqsubseteq s(t^\alpha(\mathrm{UU}))$.   (1)

If $\alpha$ is not a limit ordinal, (1) is easy to establish. If $\alpha$ is a
limit ordinal $\alpha = \lim_{\beta < \alpha}(\beta)$, then for all $\beta < \alpha$ we know that
$t^\beta(\mathrm{UU}) \sqsubseteq s(t^\beta(\mathrm{UU}))$. Since $t^\beta(\mathrm{UU}) \sqsubseteq t^\alpha(\mathrm{UU})$ we know that
$t^\beta(\mathrm{UU}) \sqsubseteq s(t^\alpha(\mathrm{UU}))$ and therefore $t^\alpha(\mathrm{UU}) = \sqcup_{\beta < \alpha} \{t^\beta(\mathrm{UU})\} \sqsubseteq s(t^\alpha(\mathrm{UU}))$.

Choosing $\alpha$ such that $t^\alpha(\mathrm{UU}) = f_t$ then yields the conclusion of
rule RT.
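
The semantic validity of rule RT can also be checked exhaustively in one very small standard model. The sketch below is an illustration only, under the following assumptions: the domain is the four-element chain 0 < 1 < 2 < 3 with UU = 0, the wff P is empty, and the helper names are hypothetical. A monotone map on the chain is represented as a nondecreasing tuple.

```python
from itertools import combinations_with_replacement

N = 4  # the chain 0 < 1 < 2 < 3, with UU = 0

# every nondecreasing tuple of length N is a monotone map from the chain into itself
MONOTONE_MAPS = list(combinations_with_replacement(range(N), N))


def lfp(f):
    """Least fixed point, reached by iterating f from the bottom element."""
    x = 0
    while f[x] != x:
        x = f[x]
    return x


def premise(f, g):
    """For every x: x below g(x) implies f(x) below g(f(x))."""
    return all(f[x] <= g[f[x]] for x in range(N) if x <= g[x])


def conclusion(f, g):
    """The least fixed point m of f satisfies m below g(m)."""
    m = lfp(f)
    return m <= g[m]


def rt_is_valid():
    return all(
        conclusion(f, g)
        for f in MONOTONE_MAPS
        for g in MONOTONE_MAPS
        if premise(f, g)
    )


print(len(MONOTONE_MAPS), rt_is_valid())
```

A finite chain cannot exhibit the transfinite part of the argument, since the iteration stabilizes after finitely many steps; the check only confirms the finite case.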

2. Justification of Some Proof Techniques

Suitable choices of the semantic definition of programming languages
allow us to reduce most of the proof techniques described in the literature
to the rule of fixed-point induction. In particular, this applies to the
methods described in McCarthy [13], Naur [24], Floyd [7], Manna [14],
Manna-Pnueli [16], and Hoare [9]. Since Hoare's technique has been
justified in Manna-Vuillemin [17], and the connections between fixed-point
induction and the Manna-Pnueli method have been made explicit by Park [26],
we shall limit ourselves to first indicating how the Floyd-Naur method
can be explained within our formal system and then sketch the connections
with structural induction. The basic ideas in this section are from Park [26].


2.1 Description of a Flowchart-language

A flowchart is a connected graph, with two distinguished nodes
START and HALT. Nodes can be of the type assignment (X <- F(X))
or test.

[Figure: assignment node and test node of a flowchart.]

Following Floyd [7], the "meaning assigned" to such a program will be
a relation $\psi(x)$ over the values of the program variables, at the HALT
node. This output relation is obtained by "carrying along" an input
relation $\varphi(x)$, holding of the program variables at the START node. The
notation [display not recoverable] therefore means that, whenever we start
the execution of the program with inputs satisfying $\varphi$, the outputs, if any,
must satisfy $\psi$.


As  in  Chapter  2,  syntactic  objects  are  represented  by  upper-case 
letters  and  associated  semantic  objects  by  the  corresponding  lower-case 
letters . 

The semantic function $\Sigma$ is defined recursively as:

[Figure: defining equations (i) to (iv) of the semantic function.]

Equation (iv), expressing the semantics of goto's, defines the
"minimum valid inductive assertion" described in Manna [14]. There will
be essentially one such equation per loop in the program; this may
lead to systems of mutually recursive relations, depending on the
nature of nesting of the loops. According to this definition, we
have for example:

$[\lambda y_1,y_2.\,(y_1 = a) \wedge r_t(y_1,y_2)]$

where $t(r)(y_1,y_2) = [(y_1 = 0) \wedge (y_2 = 1)] \vee
[\exists x_1,x_2.\,(x_1 \neq a) \wedge r(x_1,x_2) \wedge (y_1 = x_1+1) \wedge (y_2 = (x_1+1) \cdot x_2)]$


Note  that,  in  order  to  simplify  our  semantic  description,  we  have  in  effect 
limited  ourselves  to  considering  a  flowchart  in  block-form.  If  loops  do 
not  have  this  nice  nested  structure,  the  description  would  be  slightly 
more  complex,  and  we  would  need  to  express  the  semantics  of  ill-nested 
loops  by  systems  of  mutually  recursive  equations . 

2.2 The Inductive Assertions Technique

The meaning of a flowchart program is now a (partial) predicate,
defined as the least fixed-point of some equation, say $r = t(r)$. If
we can find an "inductive assertion" $q$ such that $t(q) \sqsubseteq q$, the rule
of fixed-point induction allows us to infer that $r_t \sqsubseteq q$. This shows
that whenever the program terminates, that is, if $r_t(d) = \mathrm{true}$ for
some input $d$, then we must also have $q(d) = \mathrm{true}$.

This will be best understood by using the same example as above:

The expression $t(q) \sqsubseteq q$ is

$[(y_1 = 0) \wedge (y_2 = 1)] \vee [\exists x_1,x_2.\,(x_1 \neq a) \wedge q(x_1,x_2) \wedge (y_1 = x_1+1) \wedge (y_2 = (x_1+1) \cdot x_2)]
\sqsubseteq q(y_1,y_2)$.

Using the inference rules corresponding to those of predicate
calculus in Section 1, this formula is equivalent to

$\vdash q(0,1) = \mathrm{true}$

and

$q(y_1,y_2) \wedge y_1 \neq a = \mathrm{true} \vdash q(y_1+1, (y_1+1) \cdot y_2) = \mathrm{true}$.

This last formulation is the direct translation within our formalism
of the verification condition derived by Manna [14]. This justification
of the method gives us the additional insight that the inductive
assertions one may use for proving the partial correctness of some
program by the Manna-Floyd method are exactly the fixed-points of some
algorithmically constructed functional.
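
The two verification conditions can be tried out on a concrete candidate assertion. The sketch below is hedged accordingly: the assertion `q` (namely y2 = y1!) is an assumption of this illustration, not taken from the text, the step condition is only checked on a small grid of states, and `run` is a hypothetical rendering of the loop whose meaning is the least fixed point of t.

```python
from math import factorial


def q(y1, y2):
    """Candidate inductive assertion (an assumption of this sketch): y2 = y1!."""
    return y2 == factorial(y1)


def base_condition():
    # corresponds to  |- q(0, 1) = true
    return q(0, 1)


def step_condition(a, y1_bound=7, y2_bound=6000):
    # corresponds to  q(y1, y2) and y1 != a  |-  q(y1 + 1, (y1 + 1) * y2),
    # checked exhaustively on a small grid of states only
    for y1 in range(y1_bound):
        for y2 in range(y2_bound):
            if q(y1, y2) and y1 != a and not q(y1 + 1, (y1 + 1) * y2):
                return False
    return True


def run(a):
    """Start at (0, 1) and step (x1, x2) -> (x1 + 1, (x1 + 1) * x2) until y1 = a."""
    y1, y2 = 0, 1
    while y1 != a:
        y1, y2 = y1 + 1, (y1 + 1) * y2
    return y1, y2


print(base_condition(), step_condition(5), q(*run(5)))
```

Both conditions hold for this `q`, so every state reached by `run` satisfies it; in particular the halting state has y1 = a and y2 = a!.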

2.3 Termination of Programs

Following Park [26], we shall now prove that the rule of fixed-point
induction allows us to derive instances of (mathematical) transfinite
induction.

Let $\mathcal{D}$ be a domain, and $\prec$ a partial ordering on $\mathcal{D}$. For any
relation $R$ mapping $\mathcal{D}$ into the two-element lattice with false below
true, let
$t(R)(x) = [\forall y.\ \text{if } y \prec x \text{ then } R(y) \text{ else true}]$. The least fixed-point of $t$
is then the maximal well-ordered initial segment of the ordering $\prec$
over $\mathcal{D}$. (Note that this is the first time that we use a monotone
function which is not continuous.)


Example. Let us consider some orderings over the integers, and the
corresponding $R_t$.

If $\prec$ is $1 \prec 2 \prec 3 \prec \ldots$, then $R_t = t^{\omega}(\mathrm{UU})$ and $R_t(n)$
holds for every $n$.

If $\prec$ is $\ldots \prec 3 \prec 2 \prec 1$, then $R_t = \mathrm{UU}$ never holds.

If $\prec$ is $1 \prec 3 \prec 5 \ldots 2 \prec 4 \prec \ldots$, then $R_t = t^{2\omega}(\mathrm{UU})$
and $R_t(n)$ holds for every $n$.

If $\prec$ is $1 \prec 3 \prec 5 \ldots \prec 6 \prec 4 \prec 2$, then $R_t = t^{\omega}(\mathrm{UU})$
and $R_t(n)$ holds only of the odd natural numbers.

If $\prec$ is $1 \prec 3 \prec 5 \ldots 2 \prec 6 \prec 10 \prec \ldots 4 \prec 12 \prec 20 \prec \ldots$, then
$R_t = t^{\omega^2}(\mathrm{UU})$ and $R_t(n)$ holds for every $n$.

□


If $\prec$ is a well-founded relation over $\mathcal{D}$, then $R_t(x)$ holds for
any element $x$ of $\mathcal{D}$, in which case the "program" R(x) <= t(R)(x)
can be thought of as defining recursively our domain.
In other words, if

$WO \equiv \lambda{\prec}.\,\mu R.\,\lambda x.[(\forall y)\ \text{if } y \prec x \text{ then } R(y) \text{ else true}]$,

the equality $WO(\prec)(x) = \mathcal{D}(x)$ characterizes the relation $\prec$ as being
well-founded. (See also Hitchcock-Park [8] for a more elegant formulation
of this equality.)

No matter what kind of ordering $\prec$ is, fixed-point induction
translates into the following rule:

$[(\forall y).\ \text{if } y \prec x \text{ then } P(y) \text{ else true}] \sqsubseteq P(x) \vdash WO(\prec)(x) \sqsubseteq P(x)$

And in particular, if $\prec$ is well founded over $\mathcal{D}$, then $P(x) = \mathrm{true}$
will hold for any $x$ in $\mathcal{D}$. Depending on the interpretation of $\prec$,
this is a formulation of structural induction or transfinite induction
(see Chapter 4, Section 3).
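
Over a finite set the least fixed point of t can be computed by iteration, which gives a concrete picture of the well-founded part. The sketch below rests on assumptions: the function name `well_founded_part` is hypothetical, and, because a finite strict ordering is always well-founded, the failing case uses a relation with a cycle as a stand-in for an infinite descending chain.

```python
def well_founded_part(elements, precedes):
    """Least fixed point of t(R)(x) = (for all y, y precedes x implies R(y))."""
    elements = list(elements)
    r = set()  # R = UU: the relation that holds of nothing
    while True:
        nxt = {
            x for x in elements
            if all(y in r for y in elements if precedes(y, x))
        }
        if nxt == r:
            return r
        r = nxt


# the usual ordering on 1..6: every element is reached
print(sorted(well_founded_part(range(1, 7), lambda y, x: y < x)))

# a hypothetical relation with a cycle between 3 and 4: they are never reached
EDGES = {(1, 2), (3, 4), (4, 3)}
print(sorted(well_founded_part([1, 2, 3, 4], lambda y, x: (y, x) in EDGES)))
```

Each pass of the loop corresponds to one more application of t, so the n-th iterate contains exactly the elements whose predecessors are exhausted within n steps.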

For  example,  the  termination  of  the  program 

r 

F(n)  <=  if  odd(n)  then  n  else 

J  if  G(n)  =  1  then  F(f)  else  F(^y  •  F(n  -  +  ^y) 

G(n)  <=  if  even(n)  then  G(n/2)  else  n 

v. 

over the natural numbers can be established using the well ordering
$(1 \prec 3 \prec 5 \prec \ldots) \prec (2 \prec 6 \prec 10 \prec \ldots) \prec (4 \prec 12 \prec 20 \prec \ldots) \prec (\ldots) \ldots$

More examples of applications of this technique will be given in the
next chapter.


Chapter 4. PROOFS BASED UPON CONTINUITY


The  previous  chapter  was  a  first  attempt  at  proving  properties  of 
programs,  based  on  a  rather  weak  theory  of  computation.  We  shall  now  use 
our  knowledge  that  programs  are  continuous  functions,  and  justify  some 
other  proof  techniques.  The  presentation  will  again  be  quite  informal. 
However,  it  should  soon  be  apparent  that  all  the  proofs  given  can  be 

formalized  in  Milner's  Logic  for  Computable  Functions  (LCF),  as  described 
in  Section  1  of  this  chapter. 

Obviously  we  wish  to  preserve  all  the  results  obtained  in  the 
previous  chapter.  As  far  as  formal  systems  are  concerned,  one  could 
achieve this by embedding LCF in the logic described in Chapter 3. In
this  mixed  system,  terms  would  be  (syntactically)  recognizable  as  being 
monotone  or  continuous,  and  the  appropriate  rules  of  inference  could  be 
applied  accordingly.  The  logic  would  not  be  very  different  from  the 
other  two  we  describe  in  this  work.  For  example,  a  good  candidate  for 
the  induction  rule  would  be 

Rule M.

$$\frac{P \vdash g(UU) \sqsubseteq h(UU) \qquad P,\ g(x) \sqsubseteq h(x) \vdash g(f(x)) \sqsubseteq h(f(x))}{P \vdash g(\mu x.f(x)) \sqsubseteq h(\mu x.f(x))}$$

where $x$ must not be free in $P$ and $g$ must be continuous, while $h$ and $f$ only need be monotone. (This rule was independently suggested by Hitchcock-Park [8].) Its justification is very similar to that of rule RT in the preceding chapter.

Remarkably enough, there seems to be no real need to get involved in this rather complex mixed system; as long as all the terms used in the proofs denote computable functions, any of the results of Chapter 3 will still hold in LCF. For example, if we restrict ourselves to using only computable assertions, the inductive assertions method can be justified in exactly the same way. The only technique for which this constitutes a real problem is transfinite induction, and we shall give it special attention in Section 2.1.

1. Description of LCF

The formal system that we shall use is, except for some trivial changes, taken from Milner [18]. It is a typed $\lambda$-calculus version of a logic designed by Scott [30]. (We assume the reader who is interested in the technical details to be familiar with Milner's work.)

1.1  Syntax 

The  terms  of  the  logic  are  intended  to  denote  the  computable 
functions  of  various  types.  Each  term  should  therefore  be  subscripted 
with  its  type,  but  we  shall  almost  always  omit  this  subscript. 

Terms are defined recursively as:

(1) Identifiers $g, P, F, T, a, x, y, \ldots$ (at each type) and the constants UU (at each type) and TT, FF (at the type Boolean) are terms.

(2) If $s$ is of type $\alpha \to \beta$ and $t$ of type $\alpha$, then $s(t)$ is a term of type $\beta$.

(3) If $s$ is of type $\alpha$, and $x$ of type $\beta$, then $[\lambda x.s]$ is a term of type $\beta \to \alpha$.

(4) If $p$ is of type Boolean, $s$ and $t$ of type $\alpha$, then $\text{if } p \text{ then } s \text{ else } t$ is a term of type $\alpha$.

(5) If $f$ and $s$ are of type $\alpha$, then $[\mu f.s]$ is a term of type $\alpha$.

As an alternative to $[\mu f.s]$, we shall also use the notations $f_\tau$, $f \Leftarrow \tau(f)$ and $\tau\colon f \Leftarrow s$, where $\tau = [\lambda f.s]$.

A wff is a conjunction of equalities $s = t$ or inequalities $s \sqsubseteq t$ between terms, separated by commas.

A proof is a sequence $\Phi_0 \vdash \Psi_0, \ldots, \Phi_n \vdash \Psi_n$ of implications between wffs, each of which is obtained by application of the rules of inference, or use of the axioms.

For any term $s$ or wff $\Phi$, we write $s\{t/x\}$ and $\Phi\{t/x\}$ to designate the result of substituting $t$ for all the free occurrences of $x$ in $s$ and $\Phi$. An occurrence of $x$ is not free if it is bound by $\lambda x$ or $\mu x$.
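The term grammar and the substitution $s\{t/x\}$ can be made concrete as a small data structure. The following Python sketch is only an illustration under stated assumptions: types are omitted, the class names (`Var`, `Const`, `App`, `Lam`, `Mu`, `If`) and the function `subst` are ours, and the substitution assumes that $t$ has no free variable that a binder of $s$ would capture.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str  # "UU", "TT" or "FF"


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


@dataclass(frozen=True)
class Mu:
    var: str
    body: "Term"


@dataclass(frozen=True)
class If:
    cond: "Term"
    then: "Term"
    other: "Term"


Term = Union[Var, Const, App, Lam, Mu, If]


def subst(s: Term, t: Term, x: str) -> Term:
    """Return s{t/x}: replace the free occurrences of x in s by t.

    Assumption: no free variable of t is captured by a binder of s.
    """
    if isinstance(s, Var):
        return t if s.name == x else s
    if isinstance(s, Const):
        return s
    if isinstance(s, App):
        return App(subst(s.fun, t, x), subst(s.arg, t, x))
    if isinstance(s, (Lam, Mu)):
        if s.var == x:  # x is bound here, so no occurrence below is free
            return s
        return type(s)(s.var, subst(s.body, t, x))
    if isinstance(s, If):
        return If(subst(s.cond, t, x), subst(s.then, t, x), subst(s.other, t, x))
    raise TypeError(f"not a term: {s!r}")
```

For instance, `subst(Lam("x", Var("x")), Const("TT"), "x")` leaves the term unchanged, since the only occurrence of `x` is bound by the abstraction.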

1.2  Axioms  and  Rules  of  Inference 

In this description, $x$, $y$, $z$, $f$ denote variables, $s$ and $t$ terms, $P$, $Q$, $R$ wffs.

(a) Axioms

About the Domains

(Reflexivity)   D1: $\vdash x \sqsubseteq x$
(Transitivity)  D2: $x \sqsubseteq y,\ y \sqsubseteq z \vdash x \sqsubseteq z$
(Antisymmetry)  D3: $x \sqsubseteq y,\ y \sqsubseteq x \vdash x = y$
(Minimality)    D4: $\vdash UU \sqsubseteq x$

About the Functions

(Monotonicity)  F1: $x \sqsubseteq y \vdash f(x) \sqsubseteq f(y)$
(Fixed point)   F2: $\vdash f(\mu x.f(x)) \sqsubseteq \mu x.f(x)$
($\lambda$-conversion)  F3: $\vdash [\lambda x.s](t) = s\{t/x\}$
(Bottoms)       F4: $\vdash UU(x) \sqsubseteq UU$
(Conditionals)  F5: $\vdash \text{if } UU \text{ then } x \text{ else } y = UU$
                    $\vdash \text{if } TT \text{ then } x \text{ else } y = x$
                    $\vdash \text{if } FF \text{ then } x \text{ else } y = y$

About Formulae

(Inclusion)     W1: $P \vdash Q$ ($Q$ is a subset of $P$)
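The domain axioms and the conditional axioms F5 can be checked on the smallest interesting case, the flat Boolean domain. The sketch below is an illustration only: it assumes Python's `None` stands for UU, and the helper names (`leq`, `cond`, `is_monotone`, `strict_not`, `is_defined`) are ours.

```python
from itertools import product

UU = None  # assumption: None stands for the undefined element UU


def leq(x, y):
    """x ⊑ y on a flat domain: UU is below everything, other elements only below themselves."""
    return x is UU or x == y


def cond(p, x, y):
    """Conditional in the sense of F5: an undefined test yields UU."""
    if p is UU:
        return UU
    return x if p else y


def is_monotone(f, domain):
    """Check axiom F1 for f on every comparable pair of the finite domain."""
    return all(leq(f(a), f(b)) for a, b in product(domain, repeat=2) if leq(a, b))


BOOL = [UU, True, False]


def strict_not(p):
    """Negation written with the conditional; it maps UU to UU."""
    return cond(p, False, True)


def is_defined(p):
    """Tells UU apart from defined values; this function is NOT monotone."""
    return p is not UU
```

The function `is_defined` shows why F1 is a real restriction: it sends UU to FF and TT to TT, although FF is not below TT, so it cannot be denoted by any term of the logic.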


1.3  Some  Remarks  About  the  Logic 


Incompleteness 

Using the fact that natural numbers can be defined implicitly within the system, Scott [30] showed that the set of valid implications $P \vdash Q$ is not recursively enumerable, i.e., the logic is incomplete.

It also follows directly from the undecidability of equivalence between program schemas that the set of valid theorems $\vdash P$ is not recursively enumerable.

On the other hand, if we just consider terms which correspond to Ianov schemas (Ianov [10]), the logic becomes complete. (This was proved independently by J. W. deBakker and R. Milner.) Another decidable sub-theory of LCF is described in Courcelles-Kahn-Vuillemin [3].


The  Induction  Rule  is  a  Generalization  of  McCarthy's  Recursion  Induction 

We shall use the fixed-point induction formulation of McCarthy's rule: $f(y) \sqsubseteq y \vdash \mu x.f(x) \sqsubseteq y$. This rule is easily derivable from computation induction. In order to show that computation induction cannot be derived from fixed-point induction,*/ we shall exhibit a theorem of the logic which cannot be proved by fixed-point induction.

*/ More precisely, if we replace the induction rule of LCF by fixed-point induction, the set of theorems of this modified logic is a strict subset of the theorems of LCF.

One such theorem is:

$$\sigma(\tau(x)) = \tau(\sigma(x)),\ \sigma(UU) = \tau(UU) \vdash \mu x.\sigma(x) = \mu x.\tau(x)$$

In order to prove that it cannot be derived using only fixed-point induction, notice that after removing the induction rule, neither the axioms nor the inference rules require continuity in order to be valid. We can thus define the following countermodel:

Terms will denote the hierarchy of monotone functions constructed over the following base domain:

[Diagram of the base domain, with top element $d$ and bottom element $a_0 = UU$.]


The counterexample to our theorem is provided by the functions $f$ and $g$ defined by

$$f(a_i) \equiv g(a_i) = a_{i+1};\quad f(a) = f(b) \equiv b;$$
$$f(c) = f(d) = g(b) = g(d) \equiv d;\quad g(a) = g(c) = c$$

These two functions satisfy the hypothesis but not the conclusion of our theorem ($f(UU) = g(UU)$ and $fg = gf$, while $\mu x.f(x) \neq \mu x.g(x)$), which is therefore not provable within this system.*/ Actually, the same example can be used to prove that rule RT (see Chapter 3, Section 1.6) is also less powerful than computation induction.

*/ With some slight changes, this counterexample can be used to answer a question raised by Scott [30].

The theorem is in itself an interesting one and gives in some cases an elegant way for proving equivalence between programs. For example, the functionals

$$P_1(F)(x,y) \equiv \text{if } x = 0 \text{ then } y \text{ else } F(x-1,\ y+1)$$
$$P_2(F)(x,y) \equiv \text{if } x = 0 \text{ then } y \text{ else } F(x-1,\ y)+1$$

and

$$P_3(F)(x,y) \equiv \text{if } x = y \text{ then } y \text{ else } x \cdot F(x+1,\ y)$$
$$P_4(F)(x,y) \equiv \text{if } x = y \text{ then } x \text{ else } y \cdot F(x,\ y-1)$$

over the natural numbers are such that:

$$P_1(UU) \equiv P_2(UU),\ P_1P_2 \equiv P_2P_1 \quad\text{and}\quad P_3(UU) \equiv P_4(UU),\ P_3P_4 \equiv P_4P_3.$$

The proofs of equivalence between $F \Leftarrow P_1(F)$, $F \Leftarrow P_2(F)$ and $F \Leftarrow P_3(F)$, $F \Leftarrow P_4(F)$ respectively then follow.
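The hypotheses of the theorem for $P_1$ and $P_2$ can be tested on a finite grid of arguments. This is a sketch under stated assumptions, not a proof: `None` stands for UU, the functions `bottom`, `iterate` and `agree` are illustrative helpers, and only arguments with $x, y < 6$ are compared.

```python
UU = None  # assumption: None stands for the undefined result


def bottom(x, y):
    """The totally undefined function UU."""
    return UU


def P1(F):
    return lambda x, y: y if x == 0 else F(x - 1, y + 1)


def P2(F):
    def G(x, y):
        if x == 0:
            return y
        r = F(x - 1, y)
        return UU if r is UU else r + 1  # "+1" is strict in its argument
    return G


def iterate(P, n):
    """Return P^n(UU), the n-th approximation of the least fixed point of P."""
    F = bottom
    for _ in range(n):
        F = P(F)
    return F


GRID = [(x, y) for x in range(6) for y in range(6)]


def agree(F, G):
    """True when F and G give the same result (possibly UU) on the whole grid."""
    return all(F(x, y) == G(x, y) for x, y in GRID)


same_base = agree(P1(bottom), P2(bottom))            # P1(UU) = P2(UU)
partial = iterate(P2, 3)                              # a partial function to test with
commute = agree(P1(P2(partial)), P2(P1(partial)))     # P1 P2 = P2 P1 on this sample
same_limit = agree(iterate(P1, 10), iterate(P2, 10))  # both compute x + y on the grid
```

Ten iterations are enough on this grid because the $n$-th approximation is defined exactly for $x < n$.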

1.4 Some Examples of Proofs

In  order  to  demonstrate  some  practical  aspects  of  the  method,  we 
shall  present  some  examples  of  proofs  by  computation  induction. 

To  improve  readability,  the  following  conventions  will  be  adopted 
from  now  on: 

(1) We shall omit the proofs that $f(\ldots, UU, \ldots) = UU$ whenever they are straightforward.

(2) We shall use freely the equality

$$f(\ldots,\ \text{if } p \text{ then } a \text{ else } b,\ \ldots) \equiv \text{if } p \text{ then } f(\ldots, a, \ldots) \text{ else } f(\ldots, b, \ldots)$$

whenever it is easy to establish that $f(\ldots, UU, \ldots) \equiv UU$.

(3) In the arguments by cases on some variable $p$, we shall omit the case $p = UU$ whenever it causes no problem.

(4) We shall use the parallel induction rule for systems of mutually recursive definitions. Let us describe the situation on the example

$$\begin{cases} F \Leftarrow \sigma(F,G) \\ G \Leftarrow \tau(F,G) \end{cases}$$

the generalization to more complex systems being straightforward. The rule we wish to use is

$$\frac{P \vdash Q\{UU/x\}\{UU/y\} \qquad P,\ Q \vdash Q\{\sigma(x,y)/x\}\{\tau(x,y)/y\}}{P \vdash Q\{F/x\}\{G/y\}} \quad (x,\ y \text{ not free in } P)$$

Actually, a more accurate notation would be $F \equiv \mu f.\sigma(f, \mu g.\tau(f,g))$ and $G \equiv \mu g.\tau(\mu f.\sigma(f,g), g)$.

The justification of this rule in the general case can be found in deBakker-Scott [6] or Hitchcock-Park [8].

If $F$ and $G$ happen to have the same type, we can also use the following more intuitive justification of the rule: Using the pairing function $\pi \equiv \lambda x,y.[\lambda p.\ \text{if } p \text{ then } x \text{ else } y]$, we can define $\mathcal{F} \equiv \pi(F,G)$. The components are then retrieved as $F \equiv \mathcal{F}(TT)$ and $G \equiv \mathcal{F}(FF)$, and $\mathcal{F}$ can be defined by $\mathcal{F} \Leftarrow \pi(\sigma(\mathcal{F}(TT), \mathcal{F}(FF)),\ \tau(\mathcal{F}(TT), \mathcal{F}(FF)))$. The previous rule is then a direct translation of the ordinary computation induction as applied to $\mathcal{F}$.

(5) For all the examples where computations are meant over some specific data-type (integers, natural numbers, sets, lists, etc.) we assume implicitly that the axioms for the corresponding data-types are put as premises of the theorems. Ways to axiomatize those various domains are described in Milner-Weyrauch [21] and in Newey [25].
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Now  that  equation  (a)  has  been  proved,  let  us  consider 

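$$\begin{aligned}
\tau_m(f_{\tau_n})(x) &\equiv \text{if } p(x) \text{ then } x \text{ else } f_{\tau_n}^{\,m}\, h(x)\\
&\equiv \text{if } p(x) \text{ then } x \text{ else } f_{\tau_n}\, h(x) && \text{(by (a))}\\
&\equiv \text{if } p(x) \text{ then } x \text{ else } f_{\tau_n}^{\,n}\, h(x) && \text{(by (a) again)}\\
&\equiv \tau_n(f_{\tau_n})(x) \equiv f_{\tau_n}(x).
\end{aligned}$$

It follows by fixed-point induction that $f_{\tau_m} \sqsubseteq f_{\tau_n}$ and by symmetry $f_{\tau_n} \sqsubseteq f_{\tau_m}$.

□
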

Example 2. Let us consider the two "squaring" programs

$$\tau\colon\ F(x,y,z) \Leftarrow \text{if } x = 0 \text{ then } y \text{ else } F(x-1,\ y+z,\ z)$$

and

$$\sigma\colon\ G(x,y) \Leftarrow \text{if } x = 0 \text{ then } y \text{ else } G(x-1,\ y+2x-1),$$

over the natural numbers. We wish to show that $f_\tau(x,0,x) = g_\sigma(x,0)$.

Let $P(f,g)$ be $f(y, x(x-y), x) = g(y, x^2-y^2)$. If we can prove $P(f_\tau, g_\sigma)$, the desired conclusion will follow by choosing $x$ equal to $y$.

Base. Proving $P(UU,UU)$ is straightforward.

Induction. Assuming $P(f,g)$, consider

$$\begin{aligned}
\tau(f)(y, x(x-y), x) &= \text{if } y = 0 \text{ then } x(x-0) \text{ else } f(y-1,\ x(x-y)+x,\ x) && \text{(definition of } \tau\text{)}\\
&\equiv \text{if } y = 0 \text{ then } x^2 \text{ else } f(y-1,\ x(x-(y-1)),\ x)\\
&= \text{if } y = 0 \text{ then } x^2-0^2 \text{ else } g(y-1,\ (x^2-y^2)+2y-1) && \text{(induction hypothesis)}\\
&= \sigma(g)(y, x^2-y^2).
\end{aligned}$$
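The invariant $P(f,g)$ used in Example 2 can be checked numerically. The following Python sketch transcribes the two programs directly (the recursion depth equals the first argument, so small inputs are assumed) and tests the invariant only for $y \le x$, where $x^2 - y^2$ is a natural number.

```python
def F(x, y, z):
    """tau: F(x,y,z) <= if x = 0 then y else F(x-1, y+z, z)."""
    return y if x == 0 else F(x - 1, y + z, z)


def G(x, y):
    """sigma: G(x,y) <= if x = 0 then y else G(x-1, y+2x-1)."""
    return y if x == 0 else G(x - 1, y + 2 * x - 1)


# Both programs compute the square of x.
squares_agree = all(F(x, 0, x) == G(x, 0) == x * x for x in range(30))

# The generalized statement P(f, g): f(y, x(x-y), x) = g(y, x^2 - y^2).
invariant_holds = all(
    F(y, x * (x - y), x) == G(y, x * x - y * y)
    for x in range(15)
    for y in range(x + 1)
)
```

Choosing $y = x$ in the invariant gives back the original claim, since $x(x-x) = 0$ and $x^2 - x^2 = 0$.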
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Example  3 .  (S.  Ness)  Let  us  consider  the  following  two  LISP 

functions

$$F(x) \Leftarrow \text{if } atom(x) \text{ then } x.NIL \text{ else } F(car(x)) * F(cdr(x))$$

and

$$G(x,y) \Leftarrow \text{if } atom(x) \text{ then } x.y \text{ else } G(car(x),\ G(cdr(x), y)),$$

where $*$ represents the append function. We shall prove by computation induction that $G(x,y) = F(x) * y$ (over the domain of lists).

Base. The equality $UU = UU * y$ is a consequence of the definition of $*$.

Induction. If

$$A(x,y) \equiv (\text{if } atom(x) \text{ then } x.NIL \text{ else } f(car(x)) * f(cdr(x))) * y,$$

then

$$\begin{aligned}
A(x,y) &= \text{if } atom(x) \text{ then } (x.NIL) * y \text{ else } (f(car(x)) * f(cdr(x))) * y\\
&= \text{if } atom(x) \text{ then } x.y \text{ else } f(car(x)) * (f(cdr(x)) * y) && \text{(LISP axioms)}
\end{aligned}$$

The conclusion

$$A(x,y) \equiv \text{if } atom(x) \text{ then } x.y \text{ else } g(car(x),\ g(cdr(x), y))$$

follows then by using the induction hypothesis twice.

□
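Example 3 can be tried out on a small model of S-expressions. This sketch rests on assumptions of ours: a cons cell is a Python pair `(car, cdr)`, anything that is not a tuple is an atom, a LISP list is modelled by a Python list, so that `x.NIL` becomes `[x]`, `x.y` becomes `[x] + y`, and `*` becomes list concatenation.

```python
def atom(x):
    """Assumption: every non-tuple value is an atom; cons cells are pairs."""
    return not isinstance(x, tuple)


def F(x):
    """F(x) <= if atom(x) then x.NIL else F(car(x)) * F(cdr(x))."""
    if atom(x):
        return [x]
    car, cdr = x
    return F(car) + F(cdr)


def G(x, y):
    """G(x,y) <= if atom(x) then x.y else G(car(x), G(cdr(x), y))."""
    if atom(x):
        return [x] + y
    car, cdr = x
    return G(car, G(cdr, y))


tree = (("a", "b"), ("c", ("d", "e")))
flat = F(tree)                 # the atoms of the tree, left to right
with_tail = G(tree, ["z"])     # same atoms, consed onto the list ["z"]
```

The accumulator version `G` never calls append, which is the practical point of the equivalence $G(x,y) = F(x) * y$.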


2. Modelling Some Proof Techniques Within LCF


Looking  back  at  Chapter  3,  we  realize  that  Section  2.3  on  termination 
of  programs  is  the  only  place  where  we  actually  used  functions  which  are 
not  continuous.  We  therefore  have  to  demonstrate  how  the  technique  of 
structural induction, as described, for example, in Burstall [1] or Manna-Ness-Vuillemin [15], can be modelled within LCF.

Finally,  a  method  which  was  not  accounted  for  in  Chapter  3,  since 
its  justification  requires  continuity,  is  that  of  Morris  [23]  and  we 
shall  study  it  in  Section  2.2. 

2.1  Structural  Induction 

Actually,  the  word  structural  induction  covers  two  rather  different 
techniques .  The  first  one  is  a  simple  generalization  of  the  induction 
principle  on  natural  numbers,  while  the  other  one  is  a  statement  of 
Noetherian  induction  applied  to  arbitrary  well-founded  sets,  which  is 
the  most  general  induction  principle  known  to  man. 


Simple Structural Induction
(a)  Mathematical  Induction 

The  usual  formulation  of  this  principle  for  natural  numbers  is: 

    from $p(0)$ and $\forall x\,(p(x) \Rightarrow p(x+1))$
    infer $\forall x\, p(x)$

Let the predicate $n(x) \Leftarrow \text{if } x = 0 \text{ then } TT \text{ else } n(x-1)$ characterize the natural numbers in our system. (We assume the usual axioms about $0, 1, =, +, -$ as described in Newey [25].) Let $p(x)$ be any predicate which can be expressed as a term of the $\lambda$-calculus.

From the premises

$$p(x) \sqsubseteq TT,\quad \text{if } x = 0 \text{ then } TT \text{ else } p(x-1) \sqsubseteq p(x)$$

we can infer by fixed-point induction that $n(x) \sqsubseteq p(x)$, i.e., that $p(x)$ holds for any natural number $x$.

In other words,

    from $p(0) = TT$ and $p(x) = TT \vdash p(x+1) = TT$
    infer $n(x) = TT \vdash p(x) = TT$

This method applies to any data-type which is recursively defined by a semi-computable predicate. For example, the domain $\Sigma^*$ of words over some vocabulary $\Sigma$ can be characterized by

$$word(x) \Leftarrow \text{if } x = \Lambda \text{ then } TT \text{ else } word(t(x))$$

and the corresponding principle is:

    from $(\text{if } null(x) \text{ then } p(\Lambda) \text{ else } p(t(x))) \equiv TT \vdash p(h(x) \cdot t(x)) = TT$
    infer $word(x) = TT \vdash p(x) \equiv TT$.

(We are again assuming axioms about $\Lambda$, $=$, $\cdot$, $h$, $t$.)

Example 4. Let us consider two programs for computing the factorial function:

$$F(x) \Leftarrow \text{if } x = 0 \text{ then } 1 \text{ else } x \times F(x-1)$$
$$G(x,y) \Leftarrow \text{if } x = y \text{ then } 1 \text{ else } (y+1) \times G(x,\ y+1).$$

In order to show that $G(x,0) = F(x)$, we shall prove that $n(x-y) \sqsubseteq p(x,y)$ where $p(x,y)$ is $G(x,y) \times F(y) = F(x)$. Let $r$ be defined as

$$r(x,y) \Leftarrow \text{if } x = y \text{ then } TT \text{ else } r(x,\ y+1).$$

We first prove that $r(x,y) = n(x-y)$. Then, since

$$\begin{aligned}
p(x,y) &\equiv \text{if } x = y \text{ then } F(x) = F(y) \text{ else } (y+1) \times G(x,\ y+1) \times F(y) = F(x)\\
&= \text{if } x = y \text{ then } TT \text{ else } p(x,\ y+1)
\end{aligned}$$

we can conclude that $r(x,y) \sqsubseteq p(x,y)$, i.e., $n(x-y) \sqsubseteq p(x,y)$. This last inequality is equivalent to $y \le x = TT \vdash p(x,y) = TT$.

□ 
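The predicate $p(x,y)$ of Example 4 can be evaluated directly for small arguments. In this Python sketch the two programs are transcribed as written; note that `G(x, y)` only terminates when $y \le x$, which is exactly the condition $n(x-y) = TT$ of the proof, so the test range is restricted accordingly.

```python
def F(x):
    """F(x) <= if x = 0 then 1 else x * F(x-1)."""
    return 1 if x == 0 else x * F(x - 1)


def G(x, y):
    """G(x,y) <= if x = y then 1 else (y+1) * G(x, y+1); requires y <= x."""
    return 1 if x == y else (y + 1) * G(x, y + 1)


def p(x, y):
    """The predicate of the proof: G(x,y) * F(y) = F(x)."""
    return G(x, y) * F(y) == F(x)


claim = all(G(x, 0) == F(x) for x in range(12))
generalized = all(p(x, y) for x in range(12) for y in range(x + 1))
```

Since $F(0) = 1$, the case $y = 0$ of the generalized statement is the original claim $G(x,0) = F(x)$.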


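This technique required $p$ to be a computable predicate; if $P$ is an arbitrary well-formed formula, a generalization (Milner [18]) yields:

$$\frac{Q \vdash P\{0/x\} \qquad Q,\ P \vdash P\{(x+1)/x\}}{Q \vdash n(x) \Rightarrow P} \quad (x \text{ not free in } Q)$$

where $q \Rightarrow s \sqsubseteq t$ means $(\text{if } q \text{ then } s \text{ else } UU) \sqsubseteq (\text{if } q \text{ then } t \text{ else } UU)$, and $q \Rightarrow w_1, w_2$ means $q \Rightarrow w_1,\ q \Rightarrow w_2$.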


Example 5. Let

$$rev(x) \Leftarrow F(x, \Lambda)$$
$$F(x,y) \Leftarrow \text{if } x = \Lambda \text{ then } y \text{ else } F(t(x),\ h(x) \cdot y)$$

In order to show that $rev(rev(x)) = x$, one can prove that $word(x) \Rightarrow P$, where $P$ is $rev(F(x,y)) = F(y,x)$.

□
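The generalized statement $rev(F(x,y)) = F(y,x)$ of Example 5 is easy to test on strings. In this sketch, which is an illustration and not part of the proof, words are Python strings, the empty word $\Lambda$ is `""`, $h$ and $t$ are the first character and the rest of the string, and $h(x) \cdot y$ is written `x[0] + y`.

```python
def F(x, y):
    """F(x,y) <= if x = Lambda then y else F(t(x), h(x).y)."""
    return y if x == "" else F(x[1:], x[0] + y)


def rev(x):
    """rev(x) <= F(x, Lambda)."""
    return F(x, "")


WORDS = ["", "a", "ab", "abc", "hello"]

involution = all(rev(rev(w)) == w for w in WORDS)
generalized = all(rev(F(x, y)) == F(y, x) for x in WORDS for y in WORDS)
```

Taking $y = \Lambda$ in the generalized statement gives $rev(rev(x)) = F(\Lambda, x) = x$.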


(b)  Course  of  Values  Induction 

Another  formulation  of  the  induction  principle  over  the  natural 
numbers  is  the  following: 

    from $\forall x\,[\forall y\,[y < x \Rightarrow p(y)] \Rightarrow p(x)]$
    infer $\forall x\, p(x)$

Whenever $p$ is computable, this course of values induction can also be modelled directly because the operation of bounded quantification is computable and can be defined as:

$$V \equiv \mu f.[\lambda x,p.\ \text{if } x = 0 \text{ then } TT \text{ else if } p(x-1) \text{ then } f(x-1,\ p) \text{ else } UU].$$

According to this definition, $V(x,p)$ "means" $\forall y\,(y < x \Rightarrow p(y))$. We can define the partial predicate $m \equiv \mu p.\lambda x.[V(x,p)]$ and prove that $m \equiv n$ where $n \equiv \mu f.[\lambda x.\ \text{if } x = 0 \text{ then } TT \text{ else } f(x-1)]$ as follows.

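(i) $m \sqsubseteq n$.

$$\begin{aligned}
V(x,n) &= \text{if } x = 0 \text{ then } TT \text{ else if } n(x-1) \text{ then } V(x-1,\ n) \text{ else } UU\\
&\sqsubseteq \text{if } x = 0 \text{ then } TT \text{ else } n(x-1) && \text{(by cases using the fact that } V(x-1,n) \sqsubseteq TT\text{)}\\
&= n(x).
\end{aligned}$$

Hence, $m \sqsubseteq n$ follows by fixed-point induction.

(ii) $n \sqsubseteq m$.

Since $x = 0 \equiv FF \vdash m(x-1) = V(x-1,\ m)$ by definition of $m$, we have $x = 0 \equiv FF \vdash (\text{if } m(x-1) \text{ then } V(x-1,\ m) \text{ else } UU) = m(x-1)$ (by cases again, using the fact that $m(x-1) \sqsubseteq TT$). It follows that

$$\begin{aligned}
m(x) &= \text{if } x = 0 \text{ then } TT \text{ else if } m(x-1) \text{ then } V(x-1,\ m) \text{ else } UU\\
&= \text{if } x = 0 \text{ then } TT \text{ else } m(x-1).
\end{aligned}$$

The conclusion $n \sqsubseteq m$ then follows by fixed-point induction again.

□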

Having established the equivalence $n \equiv m$, we can justify the following rule of inference:

    from $V(x,p) \equiv TT \vdash p(x) \equiv TT$
    infer $n(x) \equiv TT \vdash p(x) \equiv TT$.

A similar rule can be derived for well-formed formulas.
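The bounded quantifier $V$ and the predicates $m$ and $n$ can be run on small natural numbers. The sketch below assumes `None` for UU and non-negative integer arguments; note that, following the definition, `V` answers either TT or UU and never FF.

```python
UU = None  # assumption: None stands for the undefined truth value


def V(x, p):
    """V(x,p) <= if x = 0 then TT else if p(x-1) then V(x-1,p) else UU."""
    if x == 0:
        return True
    q = p(x - 1)
    if q is UU or q is False:  # undefined test, or the "else UU" branch
        return UU
    return V(x - 1, p)


def n(x):
    """n(x) <= if x = 0 then TT else n(x-1)."""
    return True if x == 0 else n(x - 1)


def m(x):
    """m = mu p. lambda x. V(x, p), unfolded once."""
    return V(x, m)


all_small = V(5, lambda y: y < 10)   # every y < 5 satisfies y < 10
one_fails = V(5, lambda y: y != 3)   # y = 3 fails, so the result is UU
same = all(m(x) == n(x) for x in range(8))
```

The direct evaluation of `m` recomputes its own earlier values and takes time exponential in `x`, which is harmless here but shows that the definition is meant as a specification rather than as an efficient program.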

Example 6. Let us consider a modified version of McCarthy's 91-function:

$$F(x) \Leftarrow \text{if } x < 0 \text{ then } x+1 \text{ else } F(F(x-2))$$

In order to prove that $n(x) = TT \vdash (F(x) = 0) = TT$, let $p = \lambda x.[F(x) = 0]$. The equalities $(F(0) = 0) \equiv TT$ and $(F(1) = 0) = TT$ have to be checked first and then, assuming $V(x,p) \equiv TT$ and $x > 1 = TT$, we prove $p(x)$:

$$\begin{aligned}
p(x) = (F(x) = 0) &\equiv (F(F(x-2)) = 0) && (x < 0 = FF)\\
&= (F(0) = 0) && (p(x-2) = TT)\\
&= TT && \text{(separate check)}
\end{aligned}$$

□
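The claim of Example 6 can be checked by running the program. This sketch transcribes the definition over the Python integers; the nesting depth grows roughly like $x/2$, so only moderate arguments are tried.

```python
def F(x):
    """F(x) <= if x < 0 then x+1 else F(F(x-2))."""
    return x + 1 if x < 0 else F(F(x - 2))


base_cases = (F(0), F(1))
always_zero = all(F(x) == 0 for x in range(200))
```

The two base cases unfold as $F(0) = F(F(-2)) = F(-1) = 0$ and $F(1) = F(F(-1)) = F(0) = 0$, which are the "separate checks" of the proof.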


Transfinite  Induction 

Let $\prec$ be a well-founded relation over the domain $\mathcal{D}$. We showed in Chapter 3 how to derive the following principle:

    from $\forall x \in \mathcal{D}\,\{\forall y \in \mathcal{D}\,[y \prec x \Rightarrow p(y)] \Rightarrow p(x)\}$
    infer $\forall x \in \mathcal{D}\,\{p(x)\}$

The proof given precluded continuity and is therefore not applicable in the present context.

We shall describe a technique for deriving in LCF any instance of the above rule one may need in "practical" cases. Here, a "practical" well-founded relation is either one of the basic orderings described in the preceding section or an ordering constructed as a well-founded collection of well-founded relations.*/ Since we already know how to handle the "base" case, all we need to model is the construction of complex orderings from simpler ones.

*/ This is equivalent to multiplying the corresponding ordinals. The operation corresponding to ordinal exponentiation can be modelled just as well, although we could never find any practical application for it.

Let $\prec_1$ be a computable well-founded relation over the recursive domain $\mathcal{D}_1$, and, for any $x \in \mathcal{D}_1$, let $\prec_x$ be a well-founded relation over $\mathcal{D}_2(x)$. We then consider the domain $\mathcal{D} = \{(x,y) \mid x \in \mathcal{D}_1,\ y \in \mathcal{D}_2(x)\}$ together with the ordering $\prec$ where $(x,y) \prec (x',y')$ is equivalent to $x \prec_1 x'$ or $(x = x') \wedge (y \prec_x y')$. Assuming we already know that the rules

(1)  $$\frac{Q,\ x' \prec_1 x \Rightarrow P\{x'/x\} \vdash P}{Q \vdash \mathcal{D}_1(x) \Rightarrow P} \quad (x,\ x' \text{ not free in } Q)$$

and

(2)  $$\frac{Q,\ y' \prec_x y \Rightarrow P\{y'/y\} \vdash P}{Q \vdash \mathcal{D}_2(x,y) \Rightarrow P} \quad (y,\ y' \text{ not free in } Q)$$

are valid, we want to justify the rule

(3)  $$\frac{Q,\ (x',y') \prec (x,y) \Rightarrow P\{x'/x\}\{y'/y\} \vdash P}{Q \vdash \mathcal{D}(x,y) \Rightarrow P} \quad (x,\ x',\ y \text{ and } y' \text{ not free in } Q)$$

where $\mathcal{D}(x,y) \equiv \mathcal{D}_1(x) \wedge \mathcal{D}_2(x,y)$. Assuming rules (1) and (2) and the hypothesis of rule (3), we shall prove that $Q \vdash \mathcal{D}_1(x) \wedge \mathcal{D}_2(x,y) \Rightarrow P$ in two nested inductions, by distinguishing between the following cases:

1) $x' \prec_1 x \equiv TT$.

The hypothesis of (3) is then $Q,\ x' \prec_1 x \Rightarrow P\{x'/x\} \vdash P$; hence rule (1) implies that $Q \vdash \mathcal{D}_1(x) \Rightarrow P$ and, a fortiori, $Q \vdash \mathcal{D}(x,y) \Rightarrow P$.

2) $x' \prec_1 x \equiv FF$.

Since $(x',y') \prec (x,y) \equiv TT$ is the only interesting case, one can assume that $x = x'$ and $y' \prec_x y$. The hypothesis of (3) then becomes $Q,\ y' \prec_x y \Rightarrow P\{y'/y\} \vdash P$ which, by rule (2), implies that $Q \vdash \mathcal{D}_2(x,y) \Rightarrow P$, and the conclusion $Q \vdash \mathcal{D}(x,y) \Rightarrow P$ then follows.

□

Example 7. Using the technique we just described, we shall prove that Ackermann's function

$$A(x,y) \Leftarrow \text{if } x = 0 \text{ then } y+1 \text{ else if } y = 0 \text{ then } A(x-1,\ 1) \text{ else } A(x-1,\ A(x,\ y-1))$$

is defined over the natural numbers.

Let $P$ be $n(y) \sqsubseteq n(A(x,y))$, where $n \equiv \mu f.[\lambda x.\ \text{if } x = 0 \text{ then } TT \text{ else } f(x-1)]$. We shall prove that $n(x) \Rightarrow P$, which "means" that, whenever $x$ and $y$ are natural numbers, $A(x,y)$ must also be a natural number.

The main proof is by induction on $x$.

Base: $x = 0$. In this case, $P\{0/x\}$ is $n(y) \sqsubseteq n(y+1)$ which is always true, as a consequence of the axioms about $0$, $1$ and $+$.

Induction. Assuming $P\{x-1/x\}$, that is $n(y) \sqsubseteq n(A(x-1,\ y))$, we must prove $P$, i.e., $n(y) \sqsubseteq n(A(x,y))$. Let us argue by cases on the predicate $y = 0$:

case $y = 0 \equiv TT$. Since in this case $A(x,y) = A(x-1,\ 1)$, it is sufficient to prove that

$$n(0) \sqsubseteq n(A(x-1,\ 1)). \qquad \text{(a)}$$

We know by the induction hypothesis that $n(1) \sqsubseteq n(A(x-1,\ 1))$ and equation (a) follows, since $n(0) \equiv n(1)$.

case $y = 0 \equiv FF$. Choosing $y = A(x,\ y-1)$ in the induction hypothesis $P\{x-1/x\}$ gives us:

$$n(A(x,\ y-1)) \sqsubseteq n(A(x-1,\ A(x,\ y-1))).$$

Since in this case $A(x,y) = A(x-1,\ A(x,\ y-1))$, the last inequality implies that $n(A(x,\ y-1)) \sqsubseteq n(A(x,y))$. Hence, by a "nested" fixed-point induction applied to the predicate $q(y) \equiv n(A(x,y))$ we conclude that $n(y) \sqsubseteq n(A(x,y))$.

□
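The two nested inductions of Example 7 correspond to the product ordering on pairs described before it: every recursive call of Ackermann's function is made on a pair that is smaller in that ordering. The Python sketch below records the calls and checks this; the names `lex_less` and the optional `calls` argument are illustrative additions of ours.

```python
def lex_less(a, b):
    """(x', y') ≺ (x, y): first components decrease, or are equal and second ones decrease."""
    return a[0] < b[0] or (a[0] == b[0] and a[1] < b[1])


def A(x, y, calls=None):
    """Ackermann's function; appends (caller_args, callee_args) pairs to calls if given."""
    def call(x2, y2):
        if calls is not None:
            calls.append(((x, y), (x2, y2)))
        return A(x2, y2, calls)

    if x == 0:
        return y + 1
    if y == 0:
        return call(x - 1, 1)
    return call(x - 1, call(x, y - 1))


calls = []
value = A(2, 2, calls)
descending = all(lex_less(callee, caller) for caller, callee in calls)
```

Only small arguments are tried, since the values and the recursion depth of Ackermann's function grow extremely fast.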


2.2  Truncation  Induction 

Recalling Kleene's first recursion theorem, we can characterize the least fixed-point of the program $F \Leftarrow \tau(F)$ as the least upper bound of the sequence of functions $f_0, f_1, \ldots, f_n, \ldots$ defined by $f_0 \equiv UU$ and $f_{n+1} \equiv \tau(f_n)$. The rule of truncation induction, as Morris [23] named it, can be formulated as

Rule TI

    from $Q \vdash P\{f_n/f\}$ for any natural number $n$
    infer $Q \vdash P\{f_\tau/f\}$.

Actually Morris [23] used the formulation

    from $Q,\ \forall m\,(m < n \Rightarrow P\{f_m/f\}) \vdash P\{f_n/f\}$ for all $n$
    infer $Q \vdash P\{f_\tau/f\}$

which is equivalent to ours since Section 2.1 of this chapter shows how to obtain the missing step, namely:

    from $Q,\ \forall m\,(m < n \Rightarrow P\{f_m/f\}) \vdash P\{f_n/f\}$ for all $n$
    infer $Q \vdash P\{f_n/f\}$ for all $n$.

A first problem which arises with rule TI is that, since it requires knowledge about the integers in its formulation, it cannot even be expressed in pure LCF. (This should be regarded as an advantage of Scott's formulation of the rule.)

More dramatic is the fact that, even in an LCF with integers (where TI can then be expressed), there does not seem to be any way to justify it, despite the fact that it is clearly valid in any standard model. It is possible to get around this difficulty by slightly extending the logic. What is needed is a formal way to talk about limits. This can be achieved by embedding data-types into complete lattices, thus going back to the original definition of data-types in Scott [29]. This idea entails the following extensions to LCF:

(1) Introduce constant terms OO (for overdefined) at each type. The corresponding axioms are $\vdash x \sqsubseteq OO$ and $\vdash OO \sqsubseteq OO(x)$. In the case-rule, the case $P\{OO/x\} \vdash Q$ should be added to the premise.

(2) If $s$ and $t$ are terms of type $\alpha$, then $\sup(s,t)$ should also be a term of type $\alpha$. It is axiomatized by $\vdash x \sqsubseteq \sup(x,y)$, $\vdash y \sqsubseteq \sup(x,y)$ and $x \sqsubseteq z,\ y \sqsubseteq z \vdash \sup(x,y) \sqsubseteq z$.

(3) We could introduce $\inf(x,y)$ in the same way, although we won't need it. Also, one should make up one's mind as to what $\text{if } OO \text{ then } x \text{ else } y$ ought to mean. Two extreme possibilities are $\vdash \text{if } OO \text{ then } x \text{ else } y \equiv OO$ or $\vdash \text{if } OO \text{ then } x \text{ else } y \equiv \sup(x,y)$.
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In  this  extended  logic  (along  with  the  natural  numbers)  we  can  then 
justify  rule  TI: 

First of all, one needs to express the rule within the formal system, and we shall define $f_n \equiv \tau^n(UU)$ as $iter(\tau)(n)$ where

Definition 1.

$$iter \equiv \mu f.[\lambda \tau, n.\ \text{if } n = 0 \text{ then } UU \text{ else } \tau(f(\tau)(n-1))].$$

Using this definition, it is easy to prove that

Lemma 1.

$$iter(\tau)(n) \sqsubseteq iter(\tau)(n+1)$$

and

Lemma 2.

$$iter(\tau)(n) \sqsubseteq f_\tau.$$
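Definition 1 and Lemmas 1 and 2 can be observed on a concrete functional. The sketch below is an illustration under our own assumptions: `None` stands for UU, `tau` is the functional of the factorial program (chosen only as an example), functions are compared on a finite set of sample points, and `iter_` and `below` are hypothetical helper names.

```python
import math

UU = None  # assumption: None stands for the undefined result


def bottom(x):
    """The totally undefined function UU."""
    return UU


def tau(F):
    """Functional of the program F(x) <= if x = 0 then 1 else x * F(x-1)."""
    def body(x):
        if x == 0:
            return 1
        r = F(x - 1)
        return UU if r is UU else x * r  # multiplication is strict
    return body


def iter_(t, n):
    """iter(t)(n) = t^n(UU), the n-th truncation of the program."""
    f = bottom
    for _ in range(n):
        f = t(f)
    return f


def below(f, g, points):
    """f ⊑ g on the sample points: wherever f is defined, g gives the same value."""
    return all(f(x) is UU or f(x) == g(x) for x in points)


POINTS = range(8)
chain = all(below(iter_(tau, k), iter_(tau, k + 1), POINTS) for k in range(8))  # Lemma 1
bounded = all(below(iter_(tau, k), math.factorial, POINTS) for k in range(8))   # Lemma 2
```

Here the $n$-th truncation is defined exactly on the arguments $x < n$, so the chain of truncations fills in the factorial function one argument at a time.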

We now wish to prove that $f_\tau \equiv \bigsqcup_{n \ge 0} \{f_n\}$ and, for this purpose, let

Definition 2.

$$U \equiv \mu f.[\lambda \beta, n.\ \sup(\beta(n),\ f(\beta,\ n+1))]$$

Using an induction on this formal definition of $U$, one can then prove that

Lemma 3.

$$\beta(n) \sqsubseteq g \vdash U(\beta, n) \sqsubseteq g$$

and

Lemma 4.

$$\beta(n) \sqsubseteq \beta(n+1) \vdash \gamma(U(\beta, n)) \equiv U(\lambda x.\gamma(\beta(x)),\ n).$$

Note that Lemma 4 is particularly interesting since it proves that any function $\gamma$ which can be expressed within the logic must be continuous. Kleene's first recursion theorem may now be expressed as

$$f_\tau \equiv U(iter(\tau), n)$$

and proved in two steps.

Firstly, combining Lemmas 2 and 3 yields

$$U(iter(\tau), n) \sqsubseteq f_\tau. \qquad \text{(K)}$$

Then, the other half of the proof is a little bit more complicated.

$$\begin{aligned}
\tau(U(iter(\tau), n)) &\equiv U(\lambda x.\tau(iter(\tau)(x)),\ n) && \text{(Lemmas 1 and 4)}\\
&\equiv U(\lambda x.\,iter(\tau)(x+1),\ n) && \text{(Definition 1)}\\
&\sqsubseteq U(iter(\tau), n) && \text{(Lemma 1)}
\end{aligned}$$

The conclusion

$$f_\tau \equiv U(iter(\tau), n)$$

follows by fixed-point induction.


We now have all the machinery required for justifying truncation induction. Assuming for simplicity that the well-formed formula we want to use is of the form $\alpha(f) \sqsubseteq g$, we must prove that

$$\alpha(iter(\tau)(n)) \sqsubseteq g \vdash \alpha(f_\tau) \sqsubseteq g.$$

Lemmas 1 and 4 tell us that

$$U(\lambda x.\alpha(iter(\tau)(x)),\ n) \equiv \alpha(U(iter(\tau), n)),$$

and therefore, by Lemma 3,

$$\alpha(iter(\tau)(n)) \sqsubseteq g \vdash \alpha(U(iter(\tau), n)) \sqsubseteq g.$$

Since $f_\tau \equiv U(iter(\tau), n)$ by Kleene's theorem, this last implication reduces to

$$\alpha(iter(\tau)(n)) \sqsubseteq g \vdash \alpha(f_\tau) \sqsubseteq g$$

which is what we wanted to prove.

Applications 

-- First of all, some equivalence proofs seem to be more natural using (and may in fact require) truncation induction.

For example,*/ if two functionals $s$ and $t$ satisfy $s(UU) \equiv t(UU)$ and $st \equiv t^2 s$, the natural truncation induction predicate would be $t^{2^n-1}(UU) \equiv s^n(UU)$, and therefore $\mu f.s(f) \equiv \mu f.t(f)$. If one uses the machinery we just developed, this informal proof can very easily be carried through within the extended logic. Actually, a more elegant proof (not using natural numbers) would be the following:

Define

$$M(g,f)(x) \Leftarrow \sup(f(x),\ M(g,f)(g(x)))$$

and

$$N(g,f)(x) \Leftarrow \sup(f(x),\ N(\lambda x.g(g(x)),\ f)(g(x))).$$

($M(s, \lambda f.f)(UU)$ represents $\bigsqcup_{n \ge 0} s^n(UU)$ and $N(t, \lambda f.f)(UU)$ represents $\bigsqcup_{n \ge 0} t^{2^n-1}(UU)$.) One can then prove that $f_s \equiv M(s, \lambda f.f)(UU)$ and $f_t \equiv N(t, \lambda f.f)(UU)$ and finally that

$$s(UU) \equiv t(UU),\ \lambda f.s(t(f)) \equiv \lambda f.t(t(s(f))) \vdash M(s, \lambda f.f)(UU) \equiv N(t, \lambda f.f)(UU)$$

*/ This example is due to J. W. deBakker. Robin Milner has a proof of it in pure LCF. The reader may find out for himself how tricky it is, and further away from the intuitive proof than the one presented here.
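The identity $t^{2^n-1}(UU) \equiv s^n(UU)$ used as the truncation induction predicate above can be illustrated by a purely arithmetic analogy. The following Python sketch is not a model of LCF: it assumes the integers with $0$ playing the role of UU, and the particular functions `s` and `t` are our own choice, picked only because they satisfy $s(0) = t(0)$ and $s \circ t = t \circ t \circ s$.

```python
def s(x):
    """Illustrative choice: s(x) = 2x + 1."""
    return 2 * x + 1


def t(x):
    """Illustrative choice: t(x) = x + 1."""
    return x + 1


def power(f, n, x):
    """Apply f to x exactly n times."""
    for _ in range(n):
        x = f(x)
    return x


same_start = s(0) == t(0)
commutation = all(s(t(x)) == t(t(s(x))) for x in range(50))   # s t = t^2 s
predicate = all(power(t, 2 ** k - 1, 0) == power(s, k, 0) for k in range(10))
```

The inductive step mirrors the informal proof: $s^{n+1}(0) = s(t^{2^n-1}(0)) = t^{2(2^n-1)}(s(0)) = t^{2^{n+1}-2}(t(0)) = t^{2^{n+1}-1}(0)$.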
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—  Similarly,  let  us  consider  the  following  version  of  the  induction 

rule 
Rule R6'

$$\frac{Q \vdash h \sqsubseteq f_\tau,\ P\{h/f\} \qquad Q,\ P \vdash P\{\tau(f)/f\}}{Q \vdash P\{f_\tau/f\}} \quad (f \text{ not free in } Q)$$

where the base of computation induction is not taken at the undefined element UU but at any element $h \sqsubseteq f_\tau$.

Informally and assuming $P$ to be $\alpha(f) \sqsubseteq \beta(f)$ for simplicity, the hypothesis of the rule implies that $\alpha(\tau^n(h)) \sqsubseteq \beta(\tau^n(h))$ for any $n$. On the other hand, $UU \sqsubseteq h \sqsubseteq f_\tau$ implies $\tau^n(UU) \sqsubseteq \tau^n(h) \sqsubseteq f_\tau$ and therefore $\bigsqcup_{n \ge 0} \{\tau^n(h)\} \equiv f_\tau$. The conclusion $\alpha(f_\tau) \sqsubseteq \beta(f_\tau)$ then follows easily from the continuity of $\alpha$ and monotonicity of $\beta$.

This argument can be carried through formally within the extended LCF. In particular, it applies to the following theorem

y  f  cf 

r  -  a 

T(f)  c  f  y  T(c(f))  c  tT(f) 

which is provable in the extended logic; the author does not know how to prove it (and conjectures that it is not provable) in pure LCF.
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Conclusion 


In the present state of the art, Scott's approach to the semantics of programming languages seems to be the most promising one. The theoretical foundations are sound, and a natural step would now be to describe fully the semantics of a full-size programming language, along the lines of Scott-Strachey [32], Milner-Weyrauch [21], or Reynolds [27].

Another  wide  open  and  promising  area  seems  to  be  that  of  semantics 
of  operating-systems  and  parallel  processes.  Steps  in  this  direction 
were  taken  by  Kahn  [11],  Milner  [20],  and  others. 

Finally, the question of a "best" logic for expressing a theory of computation remains. As alternatives to LCF, the systems of Hitchcock-Park [8] and deBakker-deRoever [5] have some interesting features; in an unpublished work, Scott and Milner also considered the possibility of extending LCF to a "type-free" logic whose semantic domain is one of Scott's models of the $\lambda$-calculus.

In any case, more effort should be put into studying the existing systems. In particular, LCF provides a nice framework for the area of schematology, where existing results can be expressed and sometimes simplified, and where new and interesting questions arise. (See deBakker [4] and Courcelles-Kahn-Vuillemin [3].)
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